W B Langdon's 2003 Abstracts

W.B.Langdon. 9 Jan 2013. 2003 papers, full list


Predicting Biochemical Interactions - Human P450 2D6 Enzyme Inhibition

W. B. Langdon and S. J. Barrett and B. F. Buxton. CEC 2003, pages 807-814, 8-12 Dec, Canberra. (PDF, ps.gz). Slides

ABSTRACT

In silico screening of chemical libraries or virtual chemicals may reduce drug discovery and medicine optimisation lead times and increase the probability of success by directing search through chemical space. About a dozen intelligent pharmaceutical QSAR modelling techniques were used to predict IC50 concentration (three classes) of drug interaction with a cell wall enzyme (P450 CYP2D6). Genetic programming gave comprehensible cheminformatics models which generalised best. This was shown by a blind test on GlaxoWellcome molecules of machine learning knowledge nuggets mined from SmithKline Beecham compounds. Performance on similar chemicals (interpolation) and diverse chemicals (extrapolation) suggests generalisation is more difficult than avoiding overfitting.

Two GP approaches are described: classification via regression using a multi-objective fitness measure, and direct winner-takes-all (WTA) or one-versus-all (OVA) classification. Predictive rules were compressed by separate follow-up GP runs seeded with the best program.
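The two ways of turning program outputs into a three-class prediction can be sketched as follows. This is a minimal illustration only: the cut points, the toy scoring functions and the names `regression_to_class` and `wta_classify` are assumptions made for the example, not taken from the paper.

```python
from bisect import bisect_right
from typing import Callable, Sequence

Scorer = Callable[[Sequence[float]], float]


def regression_to_class(value: float, cut_points: Sequence[float]) -> int:
    """Classification via regression: bin one numeric output into classes.

    cut_points must be sorted; n cut points give n + 1 classes.
    """
    return bisect_right(cut_points, value)


def wta_classify(scorers: Sequence[Scorer], features: Sequence[float]) -> int:
    """Winner takes all: one scorer per class, the highest output wins."""
    scores = [score(features) for score in scorers]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical cut points separating three IC50 classes.
CUT_POINTS = [0.5, 1.5]

# Hypothetical stand-ins for three evolved one-versus-all programs.
SCORERS = [
    lambda x: -x[0],                  # votes for class 0 when x[0] is small
    lambda x: 1.0 - abs(x[0] - 1.0),  # votes for class 1 when x[0] is near 1
    lambda x: x[0] - 1.0,             # votes for class 2 when x[0] is large
]
```

In the regression form a single program is evolved and its output is binned, whereas in the WTA form each class has its own program and only the ranking of their outputs matters.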

Bibliographic details


Convergence of Program Fitness Landscapes

W. B. Langdon, GECCO 2003 12-16 July, Chicago. (PDF, ps.gz) LNCS 2724

ABSTRACT

Point mutation has no effect on almost all linear programs. In two genetic programming (GP) computers (cyclic and bit flip) we calculate the fitness evaluations needed using steepest ascent and first ascent hill climbers and evolutionary search. We describe how the average fitness landscape scales with program length and give general bounds.
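The difference between the two hill climbers, in terms of fitness evaluations used, can be sketched on a toy problem. The example below assumes a plain bit string with a "count the ones" fitness in place of the paper's cyclic and bit flip computers; all function names are illustrative.

```python
from typing import Callable, List, Sequence, Tuple

Fitness = Callable[[Sequence[int]], int]


def flip(bits: Sequence[int], i: int) -> List[int]:
    """Return a copy of bits with position i inverted (a point mutation)."""
    out = list(bits)
    out[i] = 1 - out[i]
    return out


def steepest_ascent(start: Sequence[int], fitness: Fitness) -> Tuple[List[int], int]:
    """Evaluate every neighbour, move to the best, stop when none improves."""
    current, best, evals = list(start), fitness(start), 1
    while True:
        scored = []
        for i in range(len(current)):
            scored.append((fitness(flip(current, i)), i))
            evals += 1
        top, idx = max(scored, key=lambda pair: pair[0])
        if top <= best:
            return current, evals
        current, best = flip(current, idx), top


def first_ascent(start: Sequence[int], fitness: Fitness) -> Tuple[List[int], int]:
    """Move to the first improving neighbour found, then restart the scan."""
    current, best, evals = list(start), fitness(start), 1
    improved = True
    while improved:
        improved = False
        for i in range(len(current)):
            neighbour = flip(current, i)
            score = fitness(neighbour)
            evals += 1
            if score > best:
                current, best, improved = neighbour, score, True
                break
    return current, evals


def count_ones(bits: Sequence[int]) -> int:
    """Toy fitness (an assumption for this sketch): number of set bits."""
    return sum(bits)
```

Calling `steepest_ascent([0, 0, 0, 0], count_ones)` and `first_ascent([0, 0, 0, 0], count_ones)` returns the same optimum together with the number of evaluations each climber spent, which is the quantity the abstract refers to.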

Bibliographic details


The distribution of Reversible Functions is Normal

W. B. Langdon. Slides presented at GP Workshop on Theory/Practice, 15-17 May 2003, pages 173-187, University of Michigan. PDF ps.gz DOI

ABSTRACT

The distribution of reversible programs tends to a limit as their size increases. For problems with a Hamming distance fitness function the limiting distribution is binomial with an exponentially small (but non-zero) chance of a perfect solution. Sufficiently good reversible circuits are more common. Expected RMS error is also calculated. Random unitary matrices may suggest a possible extension to quantum computing. Using the genetic programming (GP) benchmark, the six multiplexer, circuits of Toffoli gates are shown to give a fitness landscape amenable to evolutionary search. Minimal CCNOT solutions to the six multiplexer are found but larger circuits are more evolvable.
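A circuit of Toffoli (CCNOT) gates can be scored on the six multiplexer roughly as follows. This is a sketch under stated assumptions: the wire layout (two address wires, four data wires, one constant 1, one scratch wire and one output wire) and the hand-built `EXAMPLE_CIRCUIT` are illustrative choices, and the circuit is not claimed to be one of the minimal solutions reported in the paper.

```python
from itertools import product
from typing import List, Sequence, Tuple

Gate = Tuple[int, int, int]  # (control_1, control_2, target)

# Assumed wire layout for this sketch.
A0, A1, D0, D1, D2, D3, ONE, ANC, OUT = range(9)


def run_circuit(circuit: Sequence[Gate], wires: Sequence[int]) -> List[int]:
    """Apply each CCNOT in turn: the target flips when both controls are 1."""
    state = list(wires)
    for c1, c2, target in circuit:
        state[target] ^= state[c1] & state[c2]
    return state


def mux6_fitness(circuit: Sequence[Gate]) -> int:
    """Count the test cases (out of 64) where the OUT wire is correct.

    This is 64 minus the Hamming distance to the target truth table.
    """
    hits = 0
    for a0, a1, d0, d1, d2, d3 in product((0, 1), repeat=6):
        wires = [a0, a1, d0, d1, d2, d3, 1, 0, 0]
        expected = (d0, d1, d2, d3)[a0 + 2 * a1]
        hits += run_circuit(circuit, wires)[OUT] == expected
    return hits


# Hand-built from the GF(2) expansion of the multiplexer; illustrative only.
EXAMPLE_CIRCUIT: List[Gate] = [
    (ONE, D0, OUT),                  # out = d0
    (A0, D0, OUT), (A0, D1, OUT),    # out ^= a0 (d0 ^ d1)
    (A1, D0, OUT), (A1, D2, OUT),    # out ^= a1 (d0 ^ d2)
    (A0, A1, ANC),                   # anc = a0 a1
    (ANC, D0, OUT), (ANC, D1, OUT),
    (ANC, D2, OUT), (ANC, D3, OUT),  # out ^= a0 a1 (d0 ^ d1 ^ d2 ^ d3)
]
```

Because each CCNOT is its own inverse, running a circuit followed by the same gates in reverse order restores the input wires, which is what makes these programs reversible.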

Bibliographic details


Comparison of AdaBoost and Genetic Programming for combining Neural Networks for Drug Discovery

W. B. Langdon and S. J. Barrett and B. F. Buxton. Presented at EvoBIO'2003, 11-14 April 2003, LNCS 2611, Essex, p87-98, Springer-Verlag. PDF ps.gz With the help of a Publication Support Grant from Evolsolve

ABSTRACT

Genetic programming (GP) based data fusion and AdaBoost can both improve in vitro prediction of Cytochrome P450 activity by combining artificial neural networks (ANN). Pharmaceutical drug design data provided by high throughput screening (HTS) is used to train many base ANN classifiers. In data mining (KDD) we must avoid overfitting. The ensembles do extrapolate from the training data to other unseen molecules, i.e. they predict inhibition of a P450 enzyme by compounds unlike the chemicals used to train them. Thus the models might provide in silico screens of virtual chemicals as well as physical ones from GlaxoSmithKline (GSK)'s cheminformatics database. The receiver operating characteristics (ROC) of boosted and evolved ensembles are given.
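For readers unfamiliar with ROC curves, the following sketch shows how one can be computed from any classifier's scores, whether the classifier is a single ANN or a combined ensemble. It is a generic illustration, not the paper's code; the function names are assumptions, and the input must contain at least one positive and one negative label.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """Sweep the decision threshold from high to low, one point per score value."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples with equal scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points: Sequence[Point]) -> float:
    """Trapezoidal area under the ROC points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

A curve that hugs the top left corner (area close to 1) indicates that inhibitors are ranked above non-inhibitors, while the diagonal (area 0.5) corresponds to random guessing.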

Bibliographic details


W.B.Langdon cs.ucl.ac.uk