W B Langdon Non-GP 2013 Abstracts

W. B. Langdon. 5 March 2014. 2013 papers, full list

Mycoplasma Contamination in The 1000 Genomes Project,

W. B. Langdon, Technical Report RN/13/10.


Mapping next generation DNA sequences from the 1000 Genomes Project against published genomes reveals many that match one or more Mycoplasma species but are not included in the reference human genome GRCh37.p5. Many of these are of low quality, but NCBI BLAST searches confirm that some high quality, high entropy sequences match Mycoplasma but no human sequences, suggesting that at least 7 percent of 1000G samples are contaminated.
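
The abstract does not say how sequence entropy was measured. As a rough illustration only, one common proxy for sequence complexity is the Shannon entropy of the base composition, which is 0 bits for a single repeated base and 2 bits when A, C, G and T are equally frequent. The function name and the two example reads below are hypothetical and are not taken from the report.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy, in bits per base, of the base composition of seq.

    Hypothetical helper for illustration; the report may use a different
    complexity measure.
    """
    if not seq:
        return 0.0
    counts = Counter(seq.upper())
    n = len(seq)
    # Sum p * log2(1/p) over the bases that actually occur.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# Made-up reads: a homopolymer run versus an evenly mixed sequence.
low_complexity = "AAAAAAAAAAAAAAAA"
mixed = "ACGTACGTACGTACGT"

print(shannon_entropy(low_complexity))
print(shannon_entropy(mixed))
```

A filter built on such a measure would discard reads whose entropy falls below a chosen threshold before attempting any BLAST confirmation; the threshold itself would be an assumption to tune.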

Correlation of Microarray Probes give Evidence for Mycoplasma Contamination in Human Studies

W. B. Langdon, GECCO-2013 Workshop MedGEC: Medical Applications of Genetic and Evolutionary Computation, Stephen L. Smith, Stefano Cagnoni and Robert Patton, editors, pp. 1447-1454.
Technical Report RN/12/11.


At least 473 Affymetrix HG-U133 +2 Homo sapiens probes match one or more species of Mycoplasma. Analysis of published data from thousands of human GeneChips finds correlations in Homo sapiens studies between different microbiology laboratories in different countries, which suggests that contamination with Mycoplasma is the common factor. This highlights the need for experts in evolutionary computation to apply due diligence before relying on public medical datasets. Caveat emptor, even if the data are free!
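
The abstract does not specify which correlation statistic was used. As a minimal sketch, assuming a plain Pearson correlation between the intensities of two probes across the same set of samples, the idea can be written as follows. The function name and the intensity values are invented for illustration and are not data from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Hypothetical helper for illustration; the paper may use another statistic.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and each series' deviation norm.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


# Made-up intensities for two probes measured on the same four samples.
probe_a = [1.0, 2.0, 3.0, 4.0]
probe_b = [2.0, 4.0, 6.0, 8.0]

print(pearson(probe_a, probe_b))
```

Probes that rise and fall together across samples from unrelated laboratories give a coefficient close to 1, which is the kind of pattern the abstract attributes to a shared contaminant.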