Mathematical and Biological Sciences

Essex University

Presented at CAMDA-2007

- Examples of spatial defects
- NCBI GEO and E-TABM-185 human GeneChips
- Non-parametric detection of spatial flaws
- Error rate for each GeneChip type
- Where next

Quantile normalised, log scaled. Red: suspiciously bright.

- Is a group of probes all above average?
- All below average?
- Use a 3 by 3 checkerboard.
- Count the number of coloured squares which have the same sign as the centre.
- For statistical independence, ignore vertical and horizontal neighbours.
- The checkerboard gives statistical independence by ensuring we do not use both perfect match (PM) and mismatch (MM) probes.
- 50% of probes are above average.
- p(all five coloured squares above average) = (1/2)^5 = 1/32
- With millions of probes, 1/32 happens too often by chance, so use a hierarchical test: see if adjacent 3x3 blocks are also all high (or all low).
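The per-block test above can be sketched in a few lines. This is a minimal illustration, not the author's R code, and it rests on several assumptions that are not in the slides: the chip is a list of rows of normalised log intensities, "average" is taken to be the chip median (so that about 50% of probes are above it), the chip is tiled with non-overlapping 3x3 blocks, and a value exactly equal to the median counts as below. The names `CHECKER_OFFSETS`, `block_sign` and `block_signs` are hypothetical.

```python
from statistics import median

# The five "coloured" checkerboard squares of a 3x3 block: the centre and
# its four diagonal neighbours. Vertical and horizontal neighbours are ignored.
CHECKER_OFFSETS = [(0, 0), (-1, -1), (-1, 1), (1, -1), (1, 1)]


def block_sign(chip, row, col, average):
    """Classify the 3x3 block centred on (row, col).

    Returns +1 if all five coloured squares are above average,
    -1 if all five are below (or equal to) it, and 0 otherwise.
    """
    signs = [
        1 if chip[row + dr][col + dc] > average else -1
        for dr, dc in CHECKER_OFFSETS
    ]
    total = sum(signs)
    if total == len(CHECKER_OFFSETS):
        return 1
    if total == -len(CHECKER_OFFSETS):
        return -1
    return 0


def block_signs(chip):
    """Tile the chip with non-overlapping 3x3 blocks and classify each one.

    Assumption: the chip median stands in for "average".
    """
    average = median(value for row in chip for value in row)
    n_rows, n_cols = len(chip), len(chip[0])
    return [
        [block_sign(chip, r, c, average) for c in range(1, n_cols - 1, 3)]
        for r in range(1, n_rows - 1, 3)
    ]
```

Using only the diagonal squares keeps the five comparisons from sharing a vertically or horizontally adjacent probe, which is the independence argument made in the bullets above.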

- Nine 3x3 checkerboards.
- If 3 or more of the 3x3 neighbours are above average, flag the central 3x3 as suspicious.
- P(≥3) < 0.01%
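One way to read the bound above is as a binomial tail. The sketch below assumes (this is an interpretation, not stated in the slides) that the central block must itself be all high, that its 8 neighbouring blocks are independent, and that each is all high with probability 1/32. Under those assumptions the chance of a false flag stays below 0.01%, even when the all-low case is added. The function names and the grid of block signs (+1 all high, -1 all low, 0 mixed) are hypothetical.

```python
from fractions import Fraction
from math import comb

P_BLOCK = Fraction(1, 32)  # from the slides: one 3x3 checkerboard all above average


def prob_at_least(k, n, p):
    """Binomial tail: probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: centre all high AND at least 3 of its 8 neighbours all high.
p_flag_high = P_BLOCK * prob_at_least(3, 8, P_BLOCK)
# All-high and all-low flags are mutually exclusive, so the two cases add.
p_flag_either = 2 * p_flag_high


def flag_suspicious(signs, min_neighbours=3):
    """Flag each block whose sign is shared by >= min_neighbours adjacent blocks.

    signs is a grid of +1 (all high), -1 (all low) or 0 (mixed) per 3x3 block.
    Blocks on the chip edge simply have fewer neighbours to count.
    """
    n_rows, n_cols = len(signs), len(signs[0])
    flags = [[False] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            sign = signs[r][c]
            if sign == 0:
                continue  # mixed blocks are never flagged
            same = sum(
                1
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= r + dr < n_rows
                and 0 <= c + dc < n_cols
                and signs[r + dr][c + dc] == sign
            )
            flags[r][c] = same >= min_neighbours
    return flags
```

Exact fractions are used for the probability so that the comparison with the 0.01% bound does not depend on floating-point rounding.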

Red: above average.

Controls suppressed

Red: more than twice the expected value. Maximum 17 times average.

Controls suppressed


- More than 15000 GeneChips analysed.
- All published human chips have errors.
- Error rate is uneven: as high as 28% for some probes.
- Error rate depends on chip type but has fallen.
- Method for detecting spatial defects. Applied to chips one at a time.
- The technique is non-parametric (it does not assume Gaussian or other distributions), statistically sound, fast and implemented in R. (6000 HG-U133A chips processed in 5 hours 18 minutes on a desktop Linux computer.)

W.Langdon 17 Dec 2007