node7 A SIMD Interpreter for Genetic Programming on Graphics Cards

Representing the Genetic Programming Population

Note tree is interpreted from the leafs up to the root, rather than recursively from the root to the leafs.
Therefore the tree is presented to the interpreter as a linear expression in reverse polish.
In fact quite easy to implement random tree generation (eg ramped-half-and-half), subtree crossover and various types of mutation on reverse polish expressions. (In place of Lisp, like S-expressions).
Same structure used to store population on CPU as is presented to GPU.
Avoid explicit format conversion when population is loaded onto GPU.
Reverse polish notation requires only one byte per leaf or function.
Large populations (millions of individuals) are possible.
Two main sources of waste
1. Short programs take as long to execute as long programs. I pad them with no-ops, so everyone is the same length
2. Most operations (80%) are not wanted and their results are thrown away.
Leafs access data and so are much more expensive than functions.
A multiplication takes only 4 clock cycles 4/1.35Ghz = 3nS
50% of trees are leafs, so cost is dominated by leafs not functions.
We accept other interpreter overheads, so why not SIMD overhead?