W B Langdon Non-GP 2014 Abstracts

W. B. Langdon. 6 October 2015.

Computational Intelligence and Testing

W. B. Langdon, Technical Report RN/14/13


Discussion of the state of the art in, and future research on, software testing during the NII Shonan Meeting on Computational Intelligence for Software Engineering (Seminar 053, 20-23 October 2014). The discussions between software engineers and experts in artificial intelligence were mainly led by Andreas Zeller and Jens Krinke.

Mycoplasma Contamination in The 1000 Genomes Project

Here, there, and everywhere: From PCRs to next generation sequencing technologies and sequence databases, DNA contaminants creep in from the most unlikely places. By Karl Gruber, 6 July 2015.

Mistaken Identities: Researchers are working to automate the arduous task of identifying and amending mislabeled sequences in genetic databases. By Kerry Grens, 1 January 2015.

Sneaky Bacteria Impersonate as Humans.

A 1000 human genomes...and some mycoplasma too.

Contamination hits cell work (Nature News).

W. B. Langdon, BioData Mining, 2014 7(3). doi:10.1186/1756-0381-7-3

Press Release


Genes Jump Silicon Barrier

Pictures etc. for "In Silico Infection of the Human Genome".



In silico biology is increasingly important and is often based on public data. While the problem of contamination is well recognised in microbiology labs, the corresponding problem of database corruption has received less attention [1, More Mouldy Data, In Silico Infection of the Human Genome].


Mapping 50 billion next-generation DNA sequences from the 1000 Genomes Project against published genomes reveals many that match one or more Mycoplasma genomes but are not included in the reference human genome GRCh37.p5. Many of these are of low quality, but NCBI BLAST searches confirm that some high-quality, high-entropy sequences match Mycoplasma but no human sequences.


It appears at least 7% of 1000G samples are contaminated.
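The abstract distinguishes low quality reads from "high entropy" ones. As a rough illustration of what such a screen might look like, the sketch below computes the Shannon entropy of a read's base composition. The function names and the 1.5 bit threshold are assumptions made for illustration only; the paper's actual entropy measure and cut-off are not given here.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits per base) of the base composition of seq."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_high_entropy(seq: str, threshold: float = 1.5) -> bool:
    """Hypothetical screen: the threshold is an assumed value, not the paper's."""
    return shannon_entropy(seq) >= threshold


if __name__ == "__main__":
    for read in ["AAAAAAAAAAAA", "ACACACACACAC", "ACGTTGCAAGCT"]:
        print(read, round(shannon_entropy(read), 3), is_high_entropy(read))
```

A homopolymer run scores 0 bits and a read using all four bases equally scores 2 bits, so a composition-based screen of this kind discards low-complexity reads that could match many genomes by chance.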


Keywords: Molecular biology, Microbiology, Genetics, Metagenomics, Data mining, Next-generation DNA sequencing, Data cleansing, High throughput, Solexa, 454, SOLiD.

Schematic showing major data flows in the Mycoplasma analysis of the 1000 Genomes Project (top, color). A random sample of next-generation scans is copied across the Internet to the computer at UCL (black). Bowtie [1] is used to extract individual and paired-end DNA measurements which match one or more of the published Mycoplasma genomes. Bowtie is then used a second time to exclude DNA measurements which match the reference human genome, leaving 75879 Mycoplasma DNA measurements from 2055 scans of the 4058 downloaded.
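The two-pass filter in the caption (keep what matches Mycoplasma, then drop what also matches human) can be sketched as below. This is a toy illustration, not the author's pipeline: exact substring matching stands in for Bowtie alignment, and the reference strings, reads and function names are all made up. The final lines simply restate the caption's counts as ratios, assuming the 2055 scans are those that yielded at least one retained measurement.

```python
from typing import Callable, Iterable, List

ReadTest = Callable[[str], bool]


def two_pass_filter(reads: Iterable[str],
                    hits_mycoplasma: ReadTest,
                    hits_human: ReadTest) -> List[str]:
    """Pass 1 keeps reads matching Mycoplasma; pass 2 drops those matching human."""
    candidates = [r for r in reads if hits_mycoplasma(r)]
    return [r for r in candidates if not hits_human(r)]


def exact_hit(reference: str) -> ReadTest:
    """Toy stand-in for an aligner: a read 'matches' if it is a substring."""
    return lambda read: read in reference


# Made-up toy references and reads (assumptions, not real genome data).
MYCO_REF = "TTGACCGATAGGCTA"
HUMAN_REF = "ACGTACGTGACCGAT"
reads = ["GACCGAT", "GATAGG", "ACGTAC", "CCCCCC"]

kept = two_pass_filter(reads, exact_hit(MYCO_REF), exact_hit(HUMAN_REF))

# Ratios derived from the counts quoted in the caption.
fraction_of_scans = 2055 / 4058
mean_per_positive_scan = 75879 / 2055

if __name__ == "__main__":
    print(kept)
    print(f"{fraction_of_scans:.1%} of downloaded scans, "
          f"{mean_per_positive_scan:.1f} measurements per positive scan")
```

Running the human filter second means a read shared by both genomes ("GACCGAT" in the toy data) is discarded, so only reads that are unambiguously non-human survive.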

Cites include

Endosymbiotic origin and differential loss of eukaryotic genes, Chuan Ku, Shijulal Nelson-Sathi, Mayo Roettger, Filipa L. Sousa, Peter J. Lockhart, David Bryant, Einat Hazkani-Covo, James O. McInerney, Giddy Landan and William F. Martin, Nature 2015 August 27; 524: 427-432. doi:10.1038/nature14963 PMID: 26287458

Using populations of human and microbial genomes for organism detection in metagenomes, Sasha K. Ames, Shea N. Gardner, Jose Manuel Marti, Tom R. Slezak, Maya B. Gokhale and Jonathan E. Allen, Genome Res. 2015 July; 25(7): 1056-1067. doi:10.1101/gr.184879.114 PMCID: PMC4484388

Large-scale contamination of microbial isolate genomes by Illumina PhiX control, Supratim Mukherjee, Marcel Huntemann, Natalia Ivanova, Nikos C. Kyrpides and Amrita Pati, Standards in Genomic Sciences 2015, 10:18. doi:10.1186/1944-3277-10-18 PMCID: PMC4511556

A Novel Method for Detecting Contaminated Sample Based on Illumina Sequencing Data, Zheng Huang, Qibin Li, Wei Jin, Qijun Liao and Xiao Sun, International Journal of Bioscience, Biochemistry and Bioinformatics 2014; 4(2). http://www.ijbbb.org/papers/322-E0014.pdf

Assessing the prevalence of mycoplasma contamination in cell culture via a survey of NCBI's RNA-seq archive, Anthony O. Olarerin-George and John B. Hogenesch, Nucleic Acids Res. 2015 March 11; 43(5): 2535-2542. doi:10.1093/nar/gkv136 PMCID: PMC4357728

Here, there, and everywhere: From PCRs to next-generation sequencing technologies and sequence databases, DNA contaminants creep in from the most unlikely places, Karl Gruber, EMBO Rep. 2015 August; 16(8): 898-901. PMID: 26150097