Impact of Loops and Memory on GP Fitness Landscapes
W. B. Langdon
Paper presented at EuroGP-2006
- By adding memory and loops to linear GP we make it Turing complete, but
- What is effect of this on fitness landscape?
- T7 computer
- Experiments and results
- Mathematical models
GP is evolving O(1) functions
- Only a few papers published on evolving Turing complete programs.
- Little work on evolving programs with side effects, such as memory
- Little work of evolving recursion or iteration
- Without memory and loops there are programs which GP cannot represent and so cannot evolve.
GP and data structures
evolves GP trees with memory and includes examples of
- Foundations of Genetic Programming shows distribution of GP programs tends to a limit as programs gets bigger
- Also true for Turing complete programs?
- In limit, fraction of halting T7 programs is zero
T7 Minimal Turing Complete CPU
- 7 instructions
- Arithmetic unit is ADD. From + all other operations can be obtained. E.g.
- Boolean logic
- SUB, by adding complement
- Multiply, by repeated addition (subroutines)
- Conditional (Branch if oVerflow flag Set)
- Move data in memory
- Save and restore Program Counter (i.e. Jump)
- Stop if reach end of program
- There are too many programs to test them all. Instead we gather statistics on random samples.
- Chose set of program lengths 30 to 16777215 (longest GP ever)
- Generate 1000 programs of each length
- Run them from random start point with random input
- Program terminates if it obeys the last instruction (which must not be a jump)
- How many stop?
Controlling Run Away Programs
- Since our machine is finite, in principle it is possible to detect every infinite loop. But large memory and elapse time may be needed.
- Heuristic. Abort program if :
- any instruction is executed more >100 times.
- on re-executing any instruction, computer's state is identical to state when it was last executed.
Almost all T7 Programs Loop
Model of Random Programs
- Before any repeated instructions;
- random sequence of instructions and
- random contents of memory.
- 1 in 7 instructions is a jump to a random location
- T7 instruction set chosen to have little bias.
- I.e. every state is ≈equally likely.
- Overflow flag set half the time.
- So 50% of conditional jumps BVS are active.
- (1+0.5)/7 instructions takes program counter to a random location.
- Implies for long programs, lengths of continuous instructions (i.e. without jumps) follows a geometric distribution with mean 7/1.5=4.67
A Program Segment - Random code ending with a random jump
Forming Loops:Segments model
- Segments model assumes whole program is broken into N=L/4.67 segments of equal length of continuous instructions.
- Last instruction of each is a random jump.
- By the end of each segment, memory is re-randomised.
- Jump to any part of a segment, part of which has already been run, will
- Jump to any part of the last segment will
halt the program.
- Detailed and more accurate model in technical report
Probability of Halting
- i segments run so far. Chance next segment will
- Form first loop = i/N
- Halt program = 1/N.
- (so 1-(i+1)/N continues)
- Chance of halting immediately after segment i
- = 1/N ×
(1-2/N) (1-3/N) (1-4/N) … (1-i/N)
- = N-i (N-2)! /(N-i-1)!
- = (N-2)! N(1-N) N(N-1-i) /(N-i-1)!
- = (N-2)! N(1-N) eN Poisson(N-i-1,N)
- Total halting probability given by adding these gives
≈ sqrt(π/(2N)) = O(N-½)
Proportion of Programs without loops falls as 1/sqrt(length)
Segments model over, but gives 1/√x scaling.
Approx 1% not proved to loop forever
Number of Halting Programs rises exponentially with length
The number of halting T7 programs containing 16 million instructions exceeds
10100 000 000.
Average Run Time (non-looping)
- Mean run time given by
- Sum i weighting for segment i
∑ i ×(N-2)!N(1-N)eN Poisson(N-i-1,N)
Divided by sum of weightings (i.e. without × i)
∑ (N-2)!N(1-N)eN Poisson(N-i-1,N)
- We can bound upper part of fraction. (For long program its near one)
- Lower part gives sqrt(1/N)
- So expected run time grows as O(N½)
Run time on Terminating Programs
Run time of
non-looping programs fits
Mean run time of
all terminating programs length¾
maximum run time is limited by small, 12 bytes, memory becoming non-random.
Suppose we have a GHz T7 computer
- In 1 millisecond it executes a million instructions
- If run time of halting programs follows a geometric distribution, 99.999% of halting programs halt within 11.5×average halting time.
- Empirically average halting time = length0.75
- 99.999% of halting programs with a million instructions will execute less than 12×(106)¾ ≈ 12×3.16×104 ≈ 360000 instructions.
- I.e. 99.999% terminate in < 0.36mS or not at all.
- The halting probability falls with length to 0
- Number of terminating T7 programs rises exponentially with program size.
- Models explain inverse square root scaling
- Run time of halting programs rises slowly.
- So can be confident a randomly chosen program is looping if it does not stop almost immediately.
- Number of T7 programs grows with size.
- So almost all programs are infinitely long.
- Infinitely long program are very unlikely to stop.
With probability one, programs do not halt.