This web page updates the pi movie by also showing the genetic information (in the right-hand panel) and by giving more explanation. Both animations describe the same run.

The top screen should show an animation (20Mb) of species of programs evolving, forming demes, spreading and becoming extinct. Each of the 5000 populations of 204800 programs was evaluated on an nVidia GeForce 8800 GTX GPU.

Generation 0

The initial generation is made of randomly chosen programs. The log distributions below each panel show that random programs vary widely in terms of both performance (left) and composition (right).

The best in the population are highlighted by white cross hairs (left). The black dots indicating the best are not overwritten by the crosshairs.

Generation 1

100% selection copies the best of each pair over the weaker candidate solution. In many cases the pairs are not immediately disrupted by crossover or mutation. This leads to an immediate visual roughening of the texture, as many adjacent pixels now have similar colours. This is most obvious in the right-hand panel.

Note the creation of two red lines at 600 and 650 in the left hand log histogram. These are due to the large number of programs consisting of a single terminal. The ramped half-and-half algorithm used to create random programs (with the setting used here) does not create such tiny programs. Hence they do not appear in generation 0.

The histogram at the bottom of the right hand side shows the number of programs (on a logscale) with a particular genetic distance from the first "solution" found to the problem.

Genetic diversity

I used a simple measure of genetic distance to colour the right panel. First I align the root (output node) of the program with the root of the first solution ("opt"). Then I step through the longer program and at each instruction I calculate the absolute difference between the opcode in opt and the opcode in our program.

The differences are exponentially weighted according to the depth of the primitive relative to the root in the "opt" program and summed. Shorter programs are given bigger distances since missing opcodes are given a default of zero. Hence the two programs composed of single leaves have the two largest distance measures.
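The distance measure described above might be sketched as follows. This is only an illustration under stated assumptions, not the code used to make the animation: programs are taken to be flat lists of non-zero integer opcodes listed root first, the weight is assumed to shrink with depth as base ** -depth, and positions beyond the end of opt are assumed to reuse opt's greatest depth. The names genetic_distance, opt_depths and base are hypothetical.

```python
def genetic_distance(program, opt, opt_depths, base=2.0):
    """Depth-weighted sum of absolute opcode differences, aligned at the root.

    program, opt: lists of non-zero integer opcodes, root first (assumed layout).
    opt_depths:   depth of each primitive of opt below its root (root = 0).
    base:         hypothetical decay base; the weight is base ** -depth.
    """
    longest = max(len(program), len(opt))
    deepest = max(opt_depths)
    total = 0.0
    for i in range(longest):
        # A missing opcode defaults to zero, so short programs score high.
        a = opt[i] if i < len(opt) else 0
        b = program[i] if i < len(program) else 0
        # Assumption: positions past the end of opt reuse opt's deepest level.
        depth = opt_depths[i] if i < len(opt_depths) else deepest
        total += abs(a - b) * base ** (-depth)
    return total
```

With this layout a single-leaf program differs from opt at the root and is compared against zero everywhere else, which is why such programs end up with the largest distances.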

Negative distances are reserved to colour the genetic make up of programs which have exactly the same phenotype as "opt" but which are not syntactically identical to it.

The grey region is reserved in order to be able to display programs created by crossing over "opt" with itself. Where the two crossover points are identical, the two offspring will be identical to "opt". Thus they, like it, are displayed in black. The other self crossovers can have the same fitness (phenotype) as their parents, and others have worse fitness.

Generations 2-10

Selection tends to reduce diversity in the population as areas of similar programs start to form and expand. Since selection, crossover and mutation are limited to adjacent pixels, and these are rotated through the four cardinal directions (up, down, left, right), the maximum spread of innovation is limited to 0.5 per generation.
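One way to picture this local selection is the sketch below. It is a guess at the scheme rather than the code behind the animation, and it models selection only (no crossover or mutation). The assumptions are: the grid wraps around at its edges, its dimensions are even, each generation uses one direction in the order up, down, left, right, and cells with an even coordinate along the active axis form a disjoint pair with their neighbour in that direction. The name selection_step and the use of an error value (lower is better) are hypothetical.

```python
# Assumed order in which the four cardinal directions are visited.
DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def selection_step(grid, generation):
    """Apply one round of pairwise selection to a wrap-around grid.

    grid: list of rows, each cell holding an error (lower is better).
    Returns a new grid in which the better member of every pair has
    been copied over the weaker one.
    """
    rows, cols = len(grid), len(grid[0])
    dr, dc = DIRECTIONS[generation % 4]
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            coord = r if dr else c
            if coord % 2:
                continue  # odd cells are the passive half of a pair
            r2, c2 = (r + dr) % rows, (c + dc) % cols
            best = min(grid[r][c], grid[r2][c2])
            new[r][c] = best
            new[r2][c2] = best
    return new
```

Under these assumptions a single good program can advance at most two cells along one axis in every four generations, which is consistent with the limit of 0.5 per generation quoted above.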

The two histograms reflect the slow increase in good programs by tending to shrink towards the centre.

Generation 20

Two programs which could have been created by self-crossover are visible as yellow dots (right hand side). The triple white cross-hairs highlight the location of the three programs in the population with the best fitness, 3.14157581.

The formation of small demes of similar programs is visible in both phenotypic and genetic spaces.

Generation 30

The best program continues to spread.

Generation 40

The best program continues to spread and red and yellow spots representing children produced by recombination of copies of opt (6) are visible nearby (right-hand side).

The demes of similar programs continue to grow in size and decrease in number. This is particularly obvious when looking at the programs' chromosomes.

Generation 200

The solution deme continues to grow. Of the 1380 16-bit solutions in the population, 472 are exact copies of opt. These are plotted with the black line at zero in the centre of the genetic distance histogram. About a third of the population are within 10^-4 of the solution. These are plotted in black in the phenotype space.

Note the solution deme is not homogeneous. The genetic operations ensure that even the best programs produce diverse children. This diversity produces a red and yellow speckle of pixels amongst the black (representing opt) in the genetic space (right-hand side). In phenotype space (left) the edges of the solution deme are highlighted by white crosshairs. (Remember crosshairs are not plotted on top of the pixels with the best fitness in the population.)

The more successful demes continue to grow and force other species into extinction.

Generation 300

The right histogram shows the appearance of rival solutions within the solution deme. These are syntactically different from opt but have identical performance.

Generation 500

The rival species of solutions has consolidated within the solution deme. However, like the other successful demes, it still continues to grow. In fitness space the region occupied by opt continues to be highlighted by white cross hairs. But note these are absent from the area occupied by the new solution species, indicating it occupies the space more densely, allowing less room for non-solutions and hence no room for the crosshairs.

From generation 300 onwards, about half of the population remains more than 10^-4 away from the solution, and both the genetic and phenotypic histograms show only marginal variations.

Generation 600

By generation 600 the population has clearly divided into three dominant species. These three continue to grow and engulf smaller rivals. Just looking at the fitness space (left), it is not clear if the reddish deme or the greener one will win out over the other. However, we would expect the solution deme to continue to grow and engulf the others.

The genotypic picture reveals the continued expansion of rival solution species within the solution deme. Indeed at least one other rival solution species (yellow-green) has become established. The numbers of the rivals are plotted in the same colour in the right hand histogram.

Generation 1100

The solution deme has continued to grow and will clearly extinguish its two remaining rivals. However the space occupied by the original opt program and its children has remained about the same thickness. The bulk of the solution deme is occupied by rival species. At least three species have formed.

Generation 1600

The reddish deme succeeds in driving out the third rival deme but it is obvious that it will shortly be engulfed by the solution deme. Notice that the newer solution species continue to drive forward and at the end of the world have overthrown the protective shell of copies of the original opt program. The number of copies of opt and its children falls.

Generation 1800

The original second deme is all but gone and so too is opt.

Generation 2100

While only the new solution species remain, they actually only occupy about half the pixels. The other half is still occupied by their poorly performing offspring.

The lack of diversity in the breeding population is responsible for the stronger patterns seen in the distribution of both phenotypes and (to a lesser extent) genotypes.

Generation 5000

The population as of generation 2,100 is very stable. The only major change visible more than 2000 generations later is the extinction of the third (more yellow-green) of the new solution species.

Effective Fitness

W.Langdon xx.essex.ac.uk 27 May 2004