Discussion

From the previous sections it is clear that the Ant problem has the features often suggested of real program spaces. The program space is large and, using the simplest neighbour relationship, forms a Karst landscape containing many false peaks and many plateaus riven with deep valleys. It is clear from an analysis of the simplest solutions that there are multiple distinct and conflicting solutions to the problem, some arising from symmetries in the primitive set and some from the problem itself. The landspace is riddled with neutral networks linking programs of the same fitness in a dense and suffocating labyrinth.

A limited analysis of the schema indicates the problem is deceptive at all levels. Longer programs are on average slightly fitter but contain a slightly lower density of solutions. There are hyperspaces which do not contain solutions which are fitter than those of the same length which do. There are low and middle order schema which are required to build solutions but which are below average fitness. Schema typically have a high fitness variance. This means practical sized samples give noisy estimates of their fitness, leading GAs to choose between them randomly. However the fitness of low order schema may be estimated more reliably (as GA populations can contain many instances of them). Where they are deceptive, this may lead a GA to discard them. (Extinction of complete primitives was seen in the list and stack problems [Langdon1998a, Chapter 6 and 8,]). If real program spaces have these characteristics (we expect them to do so but be still worse) then it is important to be able to demonstrate scaleable techniques on such problem spaces. The Santa Fe trail provides a tractable problem for such demonstrations. From Table 3 it is obvious that current techniques are not doing well on it.

We have only considered the simplest solutions using a fixed representation but we have shown they cannot be assembled from small components of above average fitness (i.e. building blocks). Indeed many constructs which a human programmer might use when constructing solutions have below average fitness. However it is possible building blocks of above average fitness which GP uses exist and longer solutions can be constructed from them. Their assembly may be eased by exploiting the variable length of the representation.

Current GP techniques are not exploiting the symmetries of the problem. These lead to essentially the same solutions appearing to be the opposite of each other. Eg. either a pair of Right or pair of Left terminals at a particular location may be important. If the search technique does not recognise them as the same thing it may spend a lot of effort trying to decide between them, when perhaps either would do (cf. ``competing conventions'' in artificial neural networks). A possibly useful approach is to break this symmetry (e.g. by putting more of one primitive in the initial population) to bias the technique so that it chooses one option quickly. The tangled network of programs with same fitness which consumes much machine resources by promoting bloat might also be addressed by introducing a small bias. In the Ant problem we would expect a slight bias in favour of shorter programs to be beneficial as solutions are more frequent when programs are short.

It is clear the Ant problem is essentially difficult because of the large number of local optima. These are created by the combination of the representation, the neighbour operator and the fitness function. While there may be improvements to the representation or better search techniques we should also consider the fitness function, particularly how we reward partial solutions.

Next: Conclusions Up:

Why Ants are Previous: The Solutions

William B Langdon
Tue Feb 3 19:53:29 GMT 1998