This directory contains papers that have been announced on the GP mailing list but are not available at the GP ftp site at ftp.io.com. Please look in the ftp.io.com mirror directory for any papers that you do not find here.

Adil Qureshi, Aug '95

icga85.txt
==========
Cramer, Nichael Lynn: "A Representation for the Adaptive Generation of Simple Sequential Programs", Proceedings, International Conference on Genetic Algorithms and their Applications, July 1985 [CMU], pp. 183-187.

ABSTRACT
A system for the adaptive generation of sequential computer functions is described. A small operator set is defined and encoded in JB, a fixed-length representation. After briefly examining the limitations encountered in this approach we explore TB, a tree-like representation of the code. These representations share the feature that they can be used to represent well-formed, useful computer programs while still being amenable to suitably defined genetic operators. As a working example, the system is used to produce two-input, single-output multiplication functions that are concise and well-defined. Future work, dealing with extensions to more complicated functions and generalizations of the techniques, is also discussed.

94-006.ps.gz
============
Dynamic Training Subset Selection for Supervised Learning in Genetic Programming

Chris Gathercole, Peter Ross
Department of Artificial Intelligence
University of Edinburgh
80 South Bridge
Edinburgh EH1 1HN
U.K.
chrisg,peterg@aisb.ed.ac.uk

February 25, 1994

Abstract
When using the Genetic Programming (GP) Algorithm on a difficult problem with a large set of training data, a large population size is needed and a very large number of function-tree evaluations must be carried out. This paper describes some efforts made to reduce the number of such evaluations by concentrating on selecting a small subset of the training data set on which to actually carry out the GP algorithm. Three subset selection methods described in the paper are: Dynamic Subset Selection (DSS), using the current GP run to select `difficult' and/or previously disregarded cases; Historical Subset Selection (HSS), selecting `difficult' cases from previous GP runs; and Random Subset Selection (RSS), selecting a subset of cases at random for each generation. Various runs have shown that GP+DSS can produce better results in less than 20% of the time taken by GP. GP+HSS can nearly match the results of GP, and, perhaps surprisingly, GP+RSS can occasionally approach the results of GP. GP+DSS produced better, more general results than those reported in a paper for a variety of Neural Networks when used on a substantial problem, known as the Thyroid problem.
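For readers who want to experiment, the selection step of DSS is easy to prototype. The sketch below is only a rough illustration of per-generation subset selection, biased towards cases that are often misclassified or have not been chosen for a while; the weighting formula and all parameter names are assumptions of this note, not the exact scheme from the paper.

    # Rough sketch of per-generation dynamic subset selection.  The
    # difficulty-plus-age weighting and the parameter names are assumptions
    # of this illustration, not the paper's formula.
    import random

    def select_subset(n_cases, subset_size, difficulty, age, rng=random):
        """Pick subset_size distinct case indices, biased towards cases that
        are often misclassified (difficulty) or long unselected (age).
        Assumes subset_size <= n_cases."""
        weights = [difficulty[i] + age[i] + 1 for i in range(n_cases)]
        chosen = set()
        while len(chosen) < subset_size:
            chosen.add(rng.choices(range(n_cases), weights=weights)[0])
        return sorted(chosen)

    # Typical use inside the GP loop: evaluate the population only on the
    # chosen cases, update each case's difficulty from the misclassifications
    # seen this generation, reset age for the chosen cases and increment it
    # for the rest.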
94-007.ps.gz
============
Applications of genetic algorithms

Peter Ross and Dave Corne
Department of AI, University of Edinburgh
80 South Bridge, Edinburgh EH1 1HN
peter,dave@aisb.ed.ac.uk

Abstract
Genetic algorithms have been applied to a very wide range of practical problems, often with valuable results. This paper surveys just a few examples, to illustrate the diversity of approaches and to point to some of the considerations that have proved important in making applications successful. Because GAs provide a fairly comprehensible way to address a wide range of difficult engineering and optimisation problems, producing good if not optimal results, it seems that the technology is finding its way into real-world use much more easily than, say, expert systems did.

University of Edinburgh, Department of Artificial Intelligence, Genetic Algorithms Research Group, Paper No. 94-007. Submitted for a special issue of the AISB Quarterly on Evolutionary Computation, edited by Terry Fogarty.

CS-TR-95-1542.ps.gz
===================
John Koza's report on parallelising GP.

LearnModels.ps.gz
=================
Learning Mental Models

Astro Teller
Dept of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
astro@cs.cmu.edu

MentalModels.ps.gz
==================
Chapter 9 of AIGP (Advances in Genetic Programming).

PADO-Tech-Report.ps.gz
======================
PADO: Learning Tree Structured Algorithms for Orchestration into an Object Recognition System

Astro Teller and Manuela Veloso
February 10, 1995
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

Abstract
Most artificial intelligence systems work on simple problems and artificial domains because they rely on the accurate sensing of the task world. Object recognition is a crucial part of the sensing challenge and machine learning stands in a position to catapult object recognition into real world domains. Given that, to date, machine learning has not delivered general object recognition, we propose a different point of attack: the learning architectures themselves. We have developed a method for directly learning and combining algorithms in a new way that imposes little burden on or bias from the humans involved. This learning architecture, PADO, and the new results it brings to the problem of natural image object recognition are the focus of this report.

PADO.ps.gz
==========
PADO: A New Learning Architecture for Object Recognition

Astro Teller and Manuela Veloso
Carnegie Mellon University

Abstract
Most artificial intelligence systems today work on simple problems and artificial domains because they rely on the accurate sensing of the task world. Object recognition is a crucial part of the sensing challenge and machine learning stands in a position to catapult object recognition into real world domains. Given that, to date, machine learning has not delivered general object recognition, we propose a different point of attack: the learning architectures themselves. We have developed a method for directly learning and combining algorithms in a new way that imposes little burden on or bias from the humans involved. This learning architecture, PADO, and the new results it brings to the problem of natural image object recognition are the focus of this chapter.

Turing.ps.gz
============
Turing Completeness in the Language of Genetic Programming with Indexed Memory

Astro Teller
Carnegie Mellon University
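Teller's indexed memory amounts to adding read and write primitives over a small memory array to the GP function set. A minimal sketch follows; the primitive names and the address-wrapping convention are assumptions made here, not taken from the paper.

    # Illustrative indexed-memory primitives for a GP function set.  The
    # names and the address-wrapping convention are assumptions of this
    # sketch, not Teller's exact specification.
    MEM_SIZE = 20
    memory = [0] * MEM_SIZE

    def mem_read(addr):
        """READ(addr): return the value stored at an index (the address is
        wrapped into range so any integer argument is legal)."""
        return memory[int(addr) % MEM_SIZE]

    def mem_write(addr, value):
        """WRITE(addr, value): store value at an index and return it, so the
        node can appear anywhere an expression is expected in a program tree."""
        memory[int(addr) % MEM_SIZE] = value
        return value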
aisb.ps.gz
==========
Building New Spatial Interaction Models Using Genetic Programming

S. Openshaw and I. Turton, School of Geography, University of Leeds, Leeds, UK
email: stan@geog.leeds.ac.uk, ian@geog.leeds.ac.uk

Abstract
Building computer models of human geographic systems is difficult because of the poverty of applicable theory that can be used to specify well formed and soundly based mathematical models. As a result there has been little progress in this area since the late 1960's, when mathematical model building procedures based on entropy-maximising methods were first applied to spatial interaction data. In the mid-1980's, attention was focused on the use of genetic algorithms to explore the universe of different models that could be built from the available symbolic pieces; that is, there is a very large number of permutations of the model building blocks, viz. a mix of unary and binary operators, unknown parameters and observed variables, and rules of syntax for well formed equations. The so-called Automated Modeling System (AMS) worked but never fulfilled its expectations. The paper revisits the problem but re-casts the AMS approach into a genetic programming framework. Some preliminary results based on Cray-YMP runs are reported.

cook94a.ps.gz
=============
Journal of Artificial Intelligence Research 1 (1994) 231-255. Submitted 12/93; published 2/94.

Substructure Discovery Using Minimum Description Length and Background Knowledge

Diane J. Cook  cook@cse.uta.edu
Lawrence B. Holder  holder@cse.uta.edu
Department of Computer Science and Engineering
Box 19015, University of Texas at Arlington
Arlington, TX 76019 USA

Abstract
The ability to identify interesting and repetitive substructures is an essential component to discovering knowledge in structural data. We describe a new version of our Subdue substructure discovery system based on the minimum description length principle. The Subdue system discovers substructures that compress the original data and represent structural concepts in the data. By replacing previously discovered substructures in the data, multiple passes of Subdue produce a hierarchical description of the structural regularities in the data. Subdue uses a computationally-bounded inexact graph match that identifies similar, but not identical, instances of a substructure and finds an approximate measure of closeness of two substructures when under computational constraints. In addition to the minimum description length principle, other background knowledge can be used by Subdue to guide the search towards more appropriate substructures. Experiments in a variety of domains demonstrate Subdue's ability to find substructures capable of compressing the original data and to discover structural concepts important to the domain.
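As a rough illustration of the minimum description length idea behind Subdue, a candidate substructure can be scored by how much rewriting its instances compresses the graph. The toy description-length measure below (a plain vertex-plus-edge count) is an assumption of this sketch, not Subdue's actual graph encoding.

    # Toy MDL-style scoring of a candidate substructure.  The description
    # length used here is a crude vertex-plus-edge count, not Subdue's
    # actual graph encoding.
    def description_length(num_vertices, num_edges):
        return num_vertices + num_edges      # arbitrary "units"

    def substructure_value(graph_v, graph_e, sub_v, sub_e, instances):
        """DL(G) / (DL(S) + DL(G compressed by S)); each instance of the
        substructure is replaced by a single vertex in the compressed graph."""
        dl_graph = description_length(graph_v, graph_e)
        dl_sub = description_length(sub_v, sub_e)
        compressed_v = graph_v - instances * (sub_v - 1)
        compressed_e = graph_e - instances * sub_e
        return dl_graph / (dl_sub + description_length(compressed_v, compressed_e))

    # Substructures with a higher value compress the data more and are preferred.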
dret_postscript.tar.gz
======================
Artificial-Life Simulators and Their Applications

Howard Gutowitz
Ecole Supérieure de Physique et de Chimie Industrielles
Laboratoire d'Electronique
10 rue Vauquelin
75005 Paris, France
and
The Santa Fe Institute
1399 Hyde Park Road
Santa Fe, NM 87501
hag@santafe.edu
HTML: http://alife.santafe.edu/alife/topics/simulators/dret/dret.html

Abstract
Artificial Life (Alife) is a rapidly growing field of scientific research linking biology and computer science. It seeks to understand how life-like processes can be embodied in computer programs. Advances in this area promise to illuminate fundamental questions both in biology ("What is life?") and in computer science ("How to make robust and adaptable computer programs?"). Much of the work in artificial life is directed toward building computer simulations of artificial creatures and the artificial worlds in which they live. This report surveys major efforts in this area, with attention to developments likely to lead to practical applications in the short to middle term.

ep.ps.gz
========
Genetic Programming of Music

Jeffrey B. Putnam
New Mexico Institute of Mining and Technology
Aug 30th '94

model-of-landscapes.ps.gz
=========================
Terry Jones
February 1, 1994

Abstract
The use of the term "landscape" is increasing rapidly in the field of evolutionary computation, yet in many cases it remains poorly, if at all, defined. This situation has perhaps developed because everyone grasps the imagery immediately, and the questions that would be asked of a less evocative term do not get asked. This paper presents a model of landscapes that is general enough to encompass most of what computer scientists would call search, though the model is not restricted to either the field or the viewpoint. It is particularly relevant to algorithms that employ some form of crossover, and hence to genetic algorithms and other members of the evolutionary computing family. An overview of the consequences and properties of the model establishes a connection with more traditional search algorithms from artificial intelligence, introduces the notion of a crossover landscape, and argues the importance of viewing search as navigation and structure.

Santa Fe Institute, 1660 Old Pecos Trail, Suite A, Santa Fe, NM 87505, USA and Department of Computer Science, University of New Mexico, Albuquerque, NM 87131, USA. Email: terry@santafe.edu

ppsn-94.ps.gz
=============
Program Search with a Hierarchical Variable Length Representation: Genetic Programming, Simulated Annealing and Hill Climbing

Una-May O'Reilly and Franz Oppacher
Santa Fe Institute, Santa Fe, NM, 87505, USA
School of Computer Science, Carleton University, Ottawa, CANADA

Abstract
This paper presents a comparison of Genetic Programming (GP) with Simulated Annealing (SA) and Stochastic Iterated Hill Climbing (SIHC) based on a suite of program discovery problems which have been previously tackled only with GP. All three search algorithms employ the hierarchical variable length representation for programs brought into recent prominence with the GP paradigm [8]. We feel it is not intuitively obvious that mutation-based adaptive search can handle program discovery, yet, to date, for each GP problem we have tried, SA or SIHC also work.

ppsn92.ps.gz
============
An Experimental Perspective on Genetic Programming

Una-May O'Reilly and Franz Oppacher
Intelligent Systems Research Group, School of Computer Science, Carleton University, Ottawa, Ontario, K1S 5B6, CANADA

Abstract
Genetic Programming (GP) has recently been introduced by John R. Koza [5] as a method for genetically breeding populations of computer programs to solve problems. We believe GP to constitute a significant extension of the Genetic Algorithm (GA) research paradigm primarily because it generalizes the genetic search techniques: instead of looking for a solution to a specific instance of a problem, GP attempts to evolve a program capable of computing the solutions for any instance of the problem. We have implemented a genetic programming environment, GP*, that is capable of duplicating Koza's experiments. In this paper we describe a specific GP experiment on the evolution of programs to sort vectors, and discuss the issues that must be addressed in any application of GP: the design of fitness functions and test suites, and the selection of program terminals and functions. Our observations point to several previously unnoticed shortcomings of the GP approach. We hypothesize that these shortcomings are due to the fact that GP only uses a hierarchical representation but does not construct its solutions in an explicitly hierarchical manner.

oreilly
=======
Una-May O'Reilly's PhD thesis (An Analysis of Genetic Programming).

rosca
=====
Directory containing GP papers by J. P. Rosca ("Learning by Adapting Representations in GP" etc.).

terry
=====
Papers by Terry Jones, including his PhD thesis (Model of Landscapes etc.).

GPdata_icga-95.ps
=================
Evolving Data Structures with Genetic Programming

W. B. Langdon  W.Langdon@cs.ucl.ac.uk
Computer Science Dept.
University College London

It is established good software engineering practice to ensure that programs use memory via abstract data structures such as stacks, queues and lists. These provide an interface between the program and memory, freeing the program of memory management details which are left to the data structures to implement. The main result presented herein is that GP can automatically generate stacks and queues. Typically abstract data structures support multiple operations, such as put and get. We show that GP can simultaneously evolve all the operations of a data structure by implementing each such operation with its own independent program tree. That is, the chromosome consists of a fixed number of independent program trees. Moreover, crossover only mixes genetic material of program trees that implement the same operation. Program trees interact with each other only via shared memory and shared "Automatically Defined Functions" (ADFs). ADFs, "pass by reference" when calling them, Pareto selection, "good software engineering practice" and partitioning the genetic population into "demes" were also investigated whilst evolving the queue, in order to improve the GP solutions. Code is available from this site.
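The multi-tree chromosome described above is easy to picture in code. In the sketch below, each individual carries one tree per data-structure operation and crossover only pairs like trees with like; the operation names, the class layout and the subtree_swap helper are assumptions of this sketch, not Langdon's implementation.

    # Illustrative multi-tree chromosome: one evolved tree per data-structure
    # operation, with crossover restricted to trees implementing the same
    # operation.  Operation names, class layout and subtree_swap are
    # assumptions of this sketch, not Langdon's code.
    import copy
    import random

    OPERATIONS = ["makenull", "front", "dequeue", "enqueue", "empty"]

    class Individual:
        def __init__(self, trees):
            # trees: dict mapping operation name -> program tree for that operation
            self.trees = trees

    def crossover(mum, dad, subtree_swap):
        """Create one child.  Genetic material is mixed only between the two
        parents' trees for the SAME operation, never across operations."""
        op = random.choice(OPERATIONS)
        child_trees = {name: copy.deepcopy(tree) for name, tree in mum.trees.items()}
        child_trees[op] = subtree_swap(mum.trees[op], dad.trees[op])
        return Individual(child_trees)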
WBL_aaai-pppGP.ps
=================
Pareto Optimality, Population Partitioning, Price's theorem and GP

W. B. Langdon  W.Langdon@cs.ucl.ac.uk
Computer Science Dept.
University College London

A description of a use of Pareto optimality in genetic programming is given and an analogy with Genetic Algorithm fitness niches is drawn. Techniques to either spread the population across many Pareto optimal fitness values or to reduce the spread are described. It is speculated that a wide spread may not aid Genetic Programming. It is suggested that this might give useful insight into many GPs whose fitness is composed of several sub-objectives. The successful use of demic populations in GP leads to speculation that smaller evolutionary steps might aid GP in the long run. An example is given where Price's covariance theorem helped when designing a GP fitness function.
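The Pareto selection mentioned above rests on the standard dominance test over a vector of sub-objective scores. A small sketch follows, assuming higher scores are better; this is the generic textbook definition rather than the exact comparison used in the paper.

    # Standard Pareto dominance over per-objective scores (higher is better
    # here); a generic illustration of Pareto selection.
    def dominates(a, b):
        """True if score vector a is at least as good as b on every
        sub-objective and strictly better on at least one."""
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    def pareto_front(scores):
        """Score vectors not dominated by any other member of the population."""
        return [a for a in scores
                if not any(dominates(b, a) for b in scores if b is not a)]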
directed_crossover.ps
=====================
Summary of directing crossover locations in a multi-tree GP

W. B. Langdon  W.Langdon@cs.ucl.ac.uk
Computer Science Dept.
University College London

GPlist_aigp2.ps
===============
W. B. Langdon  W.Langdon@cs.ucl.ac.uk
Computer Science Dept.
University College London

Abstract for Advances in Genetic Programming 2
Much published work on genetic programming evolves functions without "side-effects" to learn patterns in test data; once produced, they can be used to make predictions regarding new data. In contrast, human written programs often make extensive and explicit use of memory. Indeed, memory in some form is required for a programming system to be Turing complete. In both normal and genetic programming considerable benefits have been found in adopting a "structured approach". For example, Koza has shown the introduction of evolvable code modules (ADFs) can greatly reduce the effort required for GP to reach a solution. Teller has shown that the introduction of indexed memory into genetic programming makes it Turing complete. This work builds on Teller's and considers the introduction of data structures within GP. The expectation is that the use of data structures will prove as useful to GP as code structuring has already been shown to be. So far the important result is that GP can evolve simple data structures, such as stacks and queues. This abstract reports the successful evolution of a list data structure, compares the use of Pareto selection and demes, and describes the directed choice of crossover points.

surveyRN76.ps
=============
W. B. Langdon  W.Langdon@cs.ucl.ac.uk
A. Qureshi  A.Qureshi@cs.ucl.ac.uk
Computer Science Dept.
University College London

Genetic Programming -- Computers using "Natural Selection" to generate programs
RN/95/76

Abstract
Computers that "program themselves"; science fact or fiction? Genetic Programming uses novel optimisation techniques to "evolve" simple programs, mimicking the way humans construct programs by progressively re-writing them. Trial programs are repeatedly modified in the search for "better/fitter" solutions. The underlying basis is Genetic Algorithms (GAs). Genetic Algorithms, pioneered by Holland, Goldberg and others, are evolutionary search techniques inspired by natural selection (i.e. survival of the fittest). GAs work with a "population" of trial solutions to a problem, frequently encoded as strings, and repeatedly select the "fitter" solutions, attempting to evolve better ones. The power of GAs is being demonstrated for an increasing range of applications: financial, imaging, VLSI circuit layout, gas pipeline control and production scheduling. But one of the most intriguing uses of GAs - driven by Koza - is automatic program generation. Genetic Programming applies GAs to a "population" of programs, typically encoded as tree-structures. Trial programs are evaluated against a "fitness function" and the best solutions selected for modification and re-evaluation. This modification-evaluation cycle is repeated until a "correct" program is produced. GP has demonstrated its potential by evolving simple programs for medical signal filters, classifying news stories, performing optical character recognition, and for target identification. This paper surveys the exciting field of Genetic Programming. As a basis it reviews Genetic Algorithms and automatic program generation. Next it introduces Genetic Programming, describing its history and presenting the technique via a worked example in C. Then, using a taxonomy that divides GP research into theory/techniques and applications, it surveys recent work from both of these perspectives. Extensive bibliographies, glossaries and a resource list are included as appendices.
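The modification-evaluation cycle described in the survey abstract can be written down in a few lines. The outline below is a generic, hedged summary of that loop (initialise, evaluate, select, modify, repeat), with random_program, fitness, crossover and mutate left as caller-supplied placeholders; it is not the worked C example from the report.

    # Generic GP generational loop: a hedged outline of the
    # modification-evaluation cycle, not the report's worked C example.
    # random_program, fitness, crossover and mutate are caller-supplied.
    import random

    def gp_loop(random_program, fitness, crossover, mutate,
                pop_size=500, generations=50, tournament=7):
        """Evolve programs until one reaches fitness 1.0 or the budget runs out."""
        population = [random_program() for _ in range(pop_size)]
        best_prog, best_fit = None, float("-inf")
        for gen in range(generations):
            scores = [fitness(p) for p in population]   # evaluate trial programs
            for prog, score in zip(population, scores):
                if score > best_fit:
                    best_prog, best_fit = prog, score
            if best_fit >= 1.0:                         # a "correct" program was found
                break
            def pick():                                 # tournament selection of the fitter
                group = random.sample(range(pop_size), tournament)
                return population[max(group, key=lambda i: scores[i])]
            population = [mutate(crossover(pick(), pick())) for _ in range(pop_size)]
        return best_prog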