
Paper presented at EuroGP-2006, pages 225-237

- By adding memory and loops to linear GP we make it Turing complete, but
- what is the effect of this on the fitness landscape?
- T7 computer
- Experiments and results
- Mathematical models
- Implications

- Only a few papers have been published on evolving Turing complete programs.
- Little work on evolving programs with side effects, such as memory.
- Little work on evolving recursion or iteration.
- Without memory and loops there are programs which GP cannot represent and so cannot evolve.

*GP and Data Structures* evolves GP trees with memory and includes examples of evolving loops.

- Foundations of Genetic Programming shows the distribution of GP programs tends to a limit as programs get bigger.
- Is this also true for Turing complete programs?
- In the limit, the fraction of halting T7 programs is zero.

- 7 instructions
- Arithmetic unit is ADD. From + all other operations can be obtained, e.g.:
- Boolean logic
- SUB, by adding the complement
- Multiply, by repeated addition (subroutines)
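The two reductions above can be sketched in Python; the 8-bit word width and the helper names are assumptions for illustration, not part of T7:

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1

def add(a, b):
    # The only arithmetic primitive: addition modulo 2^WIDTH.
    return (a + b) & MASK

def sub(a, b):
    # SUB via ADD: add the two's complement of b (-b = complement(b) + 1).
    return add(a, add(b ^ MASK, 1))

def mul(a, b):
    # Multiply by repeated addition, as a subroutine would.
    acc = 0
    for _ in range(b):
        acc = add(acc, a)
    return acc

print(sub(5, 3), mul(6, 7))   # 2 42
```

Results wrap modulo 2^WIDTH, matching an overflow-flag machine: `sub(3, 5)` gives 254, the two's-complement encoding of -2.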

- Conditional (Branch if oVerflow flag Set)
- Move data in memory
- Save and restore Program Counter (i.e. Jump)
- Stop on reaching the end of the program

- There are too many programs to test them all; instead we gather statistics on random samples.
- Chose a set of program lengths from 30 to 16777215 (the longest GP programs ever).
- Generate 1000 programs of each length.
- Run them from a random start point with random inputs.
- A program terminates if it obeys the last instruction (which must not be a jump).
- How many stop?

- Since our machine is finite, in principle it is possible to detect every infinite loop, but large memory and elapsed time may be needed.
- *Heuristic*. Abort the program if:
- any instruction is executed more than 100 times, or
- on re-executing any instruction, the computer's state is identical to its state when that instruction was last executed.
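One way to implement this heuristic is as a wrapper around a single-step interpreter; `run_with_cutoff`, `step` and the toy machines below are hypothetical stand-ins for the real T7 simulator, not the paper's code:

```python
def run_with_cutoff(step, state, pc, limit=100):
    """Run until the program halts, is proved to loop, or is aborted.

    step(state, pc) -> (state, pc), or None when the program halts.
    state must be hashable (e.g. a tuple of memory bytes plus the flag).
    """
    counts = {}   # how many times each instruction address has run
    seen = {}     # machine state at the last execution of each address
    while True:
        if counts.get(pc, 0) >= limit:
            return "aborted"          # executed more than `limit` times
        if pc in seen and seen[pc] == state:
            return "loop"             # identical state: provably infinite
        counts[pc] = counts.get(pc, 0) + 1
        seen[pc] = state
        nxt = step(state, pc)
        if nxt is None:
            return "halted"
        state, pc = nxt

# Toy machines to exercise the three outcomes:
halts = lambda state, pc: None if pc == 3 else (state, pc + 1)
spins = lambda state, pc: (state, pc)   # tight loop, state never changes
print(run_with_cutoff(halts, (), 0), run_with_cutoff(spins, (), 0))
```

A machine whose state keeps changing (e.g. an incrementing counter at a fixed `pc`) is never proved to loop and hits the "aborted" path instead, which is the roughly-1% false-abort case discussed below.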

- Before any repeated instructions:
- random sequence of instructions and
- random contents of memory.

- 1 in 7 instructions is a jump to a random location
- T7 instruction set chosen to have little bias.
- I.e. every state is ≈equally likely.
- Overflow flag set half the time.
- So 50% of conditional jumps BVS are active.

- (1+0.5)/7 of instructions take the program counter to a random location.
- This implies that, for long programs, the lengths of runs of contiguous instructions (i.e. without jumps) follow a geometric distribution with mean 7/1.5 ≈ 4.67.
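The arithmetic behind the 4.67 figure can be checked directly in a few lines of Python:

```python
# 1/7 of instructions are unconditional jumps; BVS is another 1/7 but is
# taken only half the time (overflow flag set 50%), so each instruction
# sends the PC to a random location with probability p.
p = (1 + 0.5) / 7

# The length of a jump-free run is then geometric,
# P(len = k) = (1-p)**(k-1) * p, whose mean 1/p matches 7/1.5.
mean = sum(k * (1 - p) ** (k - 1) * p for k in range(1, 10_000))
print(round(1 / p, 2), round(mean, 2))   # 4.67 4.67
```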

- The segments model assumes the whole program is broken into N = L/4.67 equal-length segments of contiguous instructions.
- The last instruction of each segment is a random jump.
- By the end of each segment, memory has been re-randomised.
- A jump to any part of a segment which has (even partly) already been run will form a loop.
- A jump to any part of the last segment will halt the program.
- A detailed and more accurate model is given in technical report CSM-445.

- After i segments have been run, the chance the next segment will:
- form the first loop = i/N
- halt the program = 1/N
- (so the program continues with probability 1-(i+1)/N)
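These per-segment probabilities can be summed exactly; this sketch (function name is mine) evaluates the model's total halting probability and compares it with the sqrt(π/(2N)) asymptote:

```python
import math

def halt_probability(N):
    # P(halt just after segment i) = (1/N) * prod_{j=2..i} (1 - j/N)
    total, surviving = 0.0, 1.0
    for i in range(1, N):
        total += surviving / N           # halt at segment i
        surviving *= 1 - (i + 1) / N     # neither looped nor halted yet
    return total

N = 10_000
print(halt_probability(N), math.sqrt(math.pi / (2 * N)))
```

The two values agree to within a few percent at this N, with the gap shrinking as N grows.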

- Chance of halting immediately after segment i
- = 1/N × (1-2/N)(1-3/N)(1-4/N) … (1-i/N)
- = N^{-i}(N-2)!/(N-i-1)!
- = (N-2)! N^{(1-N)} N^{(N-1-i)}/(N-i-1)!
- = (N-2)! N^{(1-N)} e^{N} Poisson(N-i-1, N)

- Total halting probability, given by adding these, is ≈ sqrt(π/(2N)) = O(N^{-½})

Approximately 1% of programs are not proved to loop forever.

- Mean run time is given by the sum of i × the weighting for segment i

∑ i × (N-2)! N^{(1-N)} e^{N} Poisson(N-i-1, N)

divided by the sum of the weightings (i.e. without the × i)

∑ (N-2)! N^{(1-N)} e^{N} Poisson(N-i-1, N)

- We can bound the upper part of the fraction. (For long programs it is near one.)
- The lower part gives sqrt(1/N).
- So expected run time grows as O(N^{½})
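A numeric check of the O(N^{½}) claim: the common factors (N-2)! N^{(1-N)} e^{N} cancel between numerator and denominator, so the plain product form of the weightings suffices (function name is mine):

```python
def mean_segments(N):
    # Expected number of segments run by halting programs:
    # sum(i * w_i) / sum(w_i), with w_i = (1/N) * prod_{j=2..i}(1 - j/N).
    num = den = 0.0
    surviving = 1.0
    for i in range(1, N):
        w = surviving / N
        num += i * w
        den += w
        surviving *= 1 - (i + 1) / N
    return num / den

ratio = mean_segments(40_000) / mean_segments(10_000)
print(ratio)   # ≈ 2: quadrupling N doubles the mean, i.e. O(sqrt(N))
```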

Run time of non-looping programs fits Markov prediction.

Mean run time of all terminating programs v. program length.

Assuming maximum run time is limited by the small memory (12 bytes) becoming non-random.

- In 1 millisecond the computer executes a million instructions.
- If the run time of halting programs follows a geometric distribution, 99.999% of halting programs halt within 11.5× the average halting time.
- Empirically, average halting time = length^{0.75}.
- 99.999% of halting programs of a million instructions will execute less than 12×(10^{6})^{¾} ≈ 12×3.16×10^{4} ≈ 360000 instructions.
- I.e. 99.999% terminate in < 0.36 ms or not at all.
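A worked check of the arithmetic above:

```python
import math

# Geometric tail: the fraction of halting programs still running after
# 11.5 mean lifetimes is exp(-11.5) ≈ 1e-5, hence the 99.999% figure.
tail = math.exp(-11.5)

L = 10**6                     # program length
bound = 11.5 * L**0.75        # ≈ 3.6e5 instructions
ms = bound / 10**6            # at a million instructions per millisecond
print(tail, bound, ms)
```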

- The halting probability falls to 0 with increasing length.
- The number of terminating T7 programs rises exponentially with program size.
- Our models explain the inverse square root scaling.
- Run time of halting programs rises slowly.
- So we can be confident a randomly chosen program is looping if it does not stop almost immediately.

- The number of T7 programs grows with size.
- So almost all programs are infinitely long.
- Infinitely long programs are very unlikely to stop.
- With probability one, programs do not halt.