
Paper 33: Fitness Causes Bloat

W. B. Langdon and R. Poli


Re: 'Fitness causes bloat'

Bill Langdon (W.B.Langdon@cs.bham.ac.uk)
Mon, 07 Jul 1997 10:55:51 +0100

[Please add this message to the discussion of paper 33 WBL]

From: James Foster <foster@cs.uidaho.edu>
Date: Wed, 2 Jul 1997 19:15:40 -0700 (PDT)

Bill,

I think I can contribute something to a discussion of real biology here, though
I'm not on the mailing list...so this response goes only to you.

> > I too would like to extend this paper to cover natural evolution;
> > however, the paper assumes a static objective function and random
> > genetic operations, while in nature things are vastly more complex and
> > there may well be bias for or against increase in genome length. In my
> > message to Peter Bentley, I suggested a possible parsimony bias in
> > some cases.
> >
> > I wonder if there is any evidence (perhaps from the human genome
> > project?) that multicellular organisms' genomes are still increasing
> > in size, or have they reached some form of equilibrium?

This question has been investigated a great deal. The bottom line: there is
*no* correlation between complexity of organism and size of genome. There is
*no* correlation even between organisms with a common divergence point. For
example, there are molds with genome sizes much larger than any mammalian
genome. There are reptiles with very large genomes, and some with very small
genomes.

There is also *no* correlation between complexity of organism and the amount of
expressed genetic material.

In biology, this is called the "C-value paradox" (if you want to do a literature
search). It is currently one of the very open problems in molecular biology.

I do think that there is clear parsimony pressure in some organisms.
Prokaryotes, for example, place a premium on minimizing the time for
reproduction. To do this, they must have small genomes. Interestingly,
prokaryotes almost *never* have introns. To put this another way (in a way
which is in direct opposition to much of what I hear in the EC community): the
most successful organisms on earth do not have introns. We need to remember that
there are more prokaryotes than eukaryotes, measured either in numbers or in
biomass.

Even more interesting are viruses. These "organisms" absolutely MUST copy their
genetic material quickly. In fact, in viruses there are not only no introns,
but there are primitive forms of data compression. To be precise, some viruses
use multiple overlapping open reading frames so that, say, a stretch 100 codons
long can code for up to about 300 amino acids, spread over several proteins.
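
A minimal sketch of what "multiple reading frames" means for a single strand,
assuming a made-up 300-nucleotide sequence (100 codons in frame 0). The
function name reading_frames and the test sequence are hypothetical and chosen
only for illustration; real viral genes overlap only where every frame stays
open (free of stop codons), which this sketch does not check.

```python
def reading_frames(seq):
    """Return the codons of seq in each of the three forward reading frames.

    Frame k starts at nucleotide offset k; trailing nucleotides that do not
    fill a whole codon are dropped.
    """
    frames = []
    for offset in range(3):
        codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        frames.append(codons)
    return frames


# Hypothetical sequence: 300 nucleotides, i.e. 100 codons when read in frame 0.
sequence = "ATGC" * 75
frames = reading_frames(sequence)

# Number of codons available in each frame, and in all three together.
codons_per_frame = [len(f) for f in frames]
total_codons = sum(codons_per_frame)
```

Reading the same nucleotides at offsets 0, 1 and 2 yields three different
codon sequences, so the same stretch can in principle carry close to three
times as many codons as a single frame does. Counting the complementary strand
as well would give six frames.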

So, your drift toward larger genome sizes is NOT ubiquitous in nature. As you
correctly point out, this says very little about bloat, since the "evolutionary
strategies" of organisms are very, very diverse and complicated.

The real question is whether the tendency toward bloat is operative in our very
simplified engineering applications. I have some doubts even there. I'm
particularly skeptical of Price's theorem here. Altenberg begins by redefining
"fitness" in the more biological sense as "contributing toward representation in
the progeny". After this redefinition it seems hardly surprising that traits
correlated with "fitness" increase in representation. Price's theorem becomes
far too tautological in the simplified EC world (though it's a big help in
population biology).
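
A small numerical sketch of the covariance statement behind Price's theorem
may make the point concrete. It assumes a hypothetical population of four
programs, with "fitness" taken to be the number of children each one leaves,
and assumes children copy their parent's trait (here, program size) exactly,
so the transmission term of Price's equation is zero. All names and numbers
are invented for illustration.

```python
def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)


def price_selection_term(trait, offspring):
    """Cov(w, z) / mean(w), using the population (divide-by-N) covariance."""
    z_bar = mean(trait)
    w_bar = mean(offspring)
    cov = mean([(w - w_bar) * (z - z_bar) for w, z in zip(offspring, trait)])
    return cov / w_bar


def next_generation_mean(trait, offspring):
    """Mean trait among the children, each child copying its parent's trait."""
    total_children = sum(offspring)
    return sum(w * z for w, z in zip(offspring, trait)) / total_children


# Hypothetical data: program sizes and number of children per program.
sizes = [10, 20, 30, 40]
children = [0, 1, 1, 2]

# Change in mean size counted directly from the children...
delta_direct = next_generation_mean(sizes, children) - mean(sizes)
# ...and the same change predicted by the covariance term.
delta_price = price_selection_term(sizes, children)
```

Both quantities come out the same. Once fitness is defined as the number of
children, the covariance term is just another way of counting those children,
which is the sense in which the result can look tautological.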

My own take is that there are several different sources of drift toward code
bloat that operate at different times in an evolutionary computation. Details
of the EC being implemented determine how strongly and when each of them
operates. It's another caucus race: all the researchers are winners (they're all
right) and they must all have prizes.

Are you going to both GP and ICGA? Terry and I will be at both. I'm looking
forward to chatting with you again!

-- 
James A. Foster			email: foster@cs.uidaho.edu
Laboratory for Applied Logic	Dept. of Computer Science
University of Idaho		http://www.cs.uidaho.edu/~foster

pgp key available at: ftp://ftp.cs.uidaho.edu/pub/foster/pgp-key.asc