From genetic-programming-owner@list.Stanford.EDU Mon Mar 14 02:40:52 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA10745
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 14 Mar 1994 02:25:44 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id PAA23092 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 15:50:58 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from aries.SAIC.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA24439; Fri, 11 Mar 1994 15:49:52 -0800
Received: from deneb.saic.com.saic.com by aries.saic.com (4.1/SMI-4.1)
	id AA07410; Fri, 11 Mar 94 16:47:13 MST
Date: Fri, 11 Mar 94 16:47:13 MST
From: pothiers@aries.saic.com (Steve Pothier)
Message-Id: <9403112347.AA07410@aries.saic.com>
Received: by deneb.saic.com.saic.com (4.1/SMI-4.1)
	id AA09859; Fri, 11 Mar 94 16:47:17 MST
To: order@netcom.com
Cc: jan@cs.stanford.edu, genetic-programming@cs.stanford.edu,
        dfaulkne@LightStream.COM
In-Reply-To: <199403112332.PAA19365@mail.netcom.com> (order@netcom.com)
Subject: Re: inbreeding vs exogamy <=> short term vs long term
   Date: Fri, 11 Mar 1994 15:32:43 -0800
   From: order@netcom.com (Walter Alden Tackett)
Status: RO

   > We should take care not to carry the analogy too far.  I don't CARE if
   > my GP produced programs that are the end result of GP breeding have
   > "low diversity" or "little ability to survive".  This is because, for

   Yes, you care if they have "low diversity."  Fisher's fundamental
   theorem of genetics states that the ability of a population to
   produce individuals with increased fitness is proportional to the
   variance of the population fitness distribution: this means that
   not only do you want structural diversity, but also a good spread
   of good, bad, and mediocre individuals.  This applies to any
   population upon which selection and recombination are performed.
   When diversity goes to zero, so do your chances of improvement or
   adaptation to new conditions.  If you're really certain that you
   have found the perfect solution to a problem, then this is OK- this
   can usually only be true for toy problems, however.  For a good
   treatment see "Population Genetics" by John Maynard-Smith (198?).

   -walter

Actually, I specifically had in mind a NON-toy problem.  Since it is a
real-world problem, I am forced to declare it "complete" at some
point.  At that point a program from the top of the fitness heap gets
stuck into production code.  No more selection or recombination is
done.  I understand the value of diversity in any interacting group
(including GP SIGs :-).  My point was just that real "production"
software usually gets to a point where you don't want GP to be part of
the day-to-day running of the system.  Program breeding can be isolated
much more easily than biological systems can!

Dave Faulkner made a good point in that it is useful to save a diverse
population as "seed stock" [paraphrasing heavily here] for future
(presumably similar) problems.  In fact I would love to see our
community working towards a time where we have an FTP site of
different seed populations just sitting on a virtual shelf, ready to
be built upon.

-Pothier-

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Mar 14 02:26:09 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA10666
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 14 Mar 1994 02:17:07 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id PAA23075 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 15:32:57 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from mail.netcom.com (netcom.netcom.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA23753; Fri, 11 Mar 1994 15:31:51 -0800
Received: from localhost by mail.netcom.com (8.6.4/SMI-4.1/Netcom)
	id PAA19365; Fri, 11 Mar 1994 15:32:43 -0800
Date: Fri, 11 Mar 1994 15:32:43 -0800
From: order@netcom.com (Walter Alden Tackett)
Message-Id: <199403112332.PAA19365@mail.netcom.com>
To: pothiers@aries.saic.com
Cc: jan@cs.stanford.edu, genetic-programming@cs.stanford.edu
In-Reply-To: <9403111642.AA02792@aries.saic.com> (pothiers@aries.saic.com)
Subject: Re: inbreeding vs exogamy <=> short term vs long term Date: Thu,
 10 Mar 94 20:17:15 PST From: zinJANthropus JANNINK <jan@CS.Stanford.EDU>
Status: RO

> We should take care not to carry the analogy too far.  I don't CARE if
> my GP produced programs that are the end result of GP breeding have
> "low diversity" or "little ability to survive".  This is because, for

Yes, you care if they have "low diversity."  Fisher's fundamental theorem 
of genetics states that the ability of a population to produce individuals
with increased fitness is proportional to the variance of the population 
fitness distribution: this means that not only do you want structural
diversity, but also a good spread of good, bad, and mediocre individuals.
This applies to any population upon which selection and recombination are
performed.  When diversity goes to zero, so do your chances of improvement
or adaptation to new conditions.  If you're really certain that you have 
found the perfect solution to a problem, then this is OK- this can usually only
be true for toy problems, however.
For a good treatment see "Population Genetics" by John Maynard-Smith (198?).
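
A small numerical sketch of the variance point above. It assumes plain
fitness-proportional selection with no recombination or mutation, which is
a simplification not specified in this thread. Under that assumption the
expected gain in mean fitness after one round of selection works out to
Var(w) / mean(w), so it vanishes once every individual has the same
fitness. The fitness values are made up for illustration.

```python
from statistics import fmean, pvariance

def expected_gain(fitnesses):
    """Expected change in mean fitness after one round of
    fitness-proportional selection (no crossover, no mutation)."""
    mean_before = fmean(fitnesses)
    # Individual i is picked with probability w_i / sum(w), so the
    # expected fitness of a selected parent is sum(w_i^2) / sum(w_i).
    mean_after = sum(w * w for w in fitnesses) / sum(fitnesses)
    return mean_after - mean_before

diverse = [1.0, 2.0, 3.0, 4.0]    # hypothetical fitness values
converged = [2.5, 2.5, 2.5, 2.5]  # same mean, zero variance

gain_diverse = expected_gain(diverse)
gain_converged = expected_gain(converged)

# The gain matches variance / mean for the diverse population.
print(gain_diverse, pvariance(diverse) / fmean(diverse))
print(gain_converged)
```

The two populations have the same mean fitness; only the one with a
spread of fitness values gives selection anything to work with.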

-walter

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Mar 14 02:25:50 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA10684
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 14 Mar 1994 02:18:33 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id OAA22823 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 14:01:21 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from ai.iit.nrc.ca (itisgate.nrc.ca) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA19718; Fri, 11 Mar 1994 14:00:16 -0800
Message-Id: <9403112159.AA06335@ai.iit.nrc.ca>
Date: Fri, 11 Mar 94 16:59:45 EST
From: Peter Turney <peter@ai.iit.nrc.ca>
To: genetic-programming@cs.stanford.edu
Subject: Re: Stacked classifiers (was Re: Sexiness...)
Cc: peter@ai.iit.nrc.ca
Status: RO


> 
> FYI, non-GA people have induced classifiers to approach the (meta-)task of
> classifying examples which are poorly classified by another,
> previously-induced classifier.  For example:
> 
> Wolpert, D. H. "Stacked Generalization"  In Neural Networks, Vol. 5, pp
> 241-259, 1992.
> 
> Chan P.K. and Stolfo, S.J. "Toward Scalable and Parallel Inductive
> Learning:  A Case Study in Splice Junction Prediction" submitted to Machine
> Learning (special issue on ML and molecular bio) 1993?
> 
> 
> I'd like to see a GA meta-learn on top of statistical decision trees.
> 
> -Eric


This is exactly my current research project.

---------------------------------------------------------------------------
 ___    __    _____        ____    |
/_ /\  /_/|  /____/ \    /___ /|   | Peter D. Turney  (peter@ai.iit.nrc.ca)
| |\ \ | || |  __ \ /|  / ___|/    | Knowledge Systems Laboratory
| ||\ \| || | |__) |/  | | |__     | National Research Council Canada
| || \   || |  __  /\  | |/__ /|   | Ottawa, Ontario, Canada, K1A 0R6
|_|/  \__|/ |_|/ \_\/   \____|/    | (613) 993-8564  FAX: 952-7151

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 16:27:05 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA26535
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 16:10:52 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id NAA22689 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 13:13:51 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from lightstream.LightStream.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA16237; Fri, 11 Mar 1994 13:12:46 -0800
Received: from cockatrice.LightStream.COM by lightstream.LightStream.COM (4.1/SMI-4.1)
	id AA01503; Fri, 11 Mar 94 16:12:46 EST
Received: by cockatrice.LightStream.COM (4.1/SMI-4.1)
	id AA13460; Fri, 11 Mar 94 16:12:44 EST
Message-Id: <9403112112.AA13460@cockatrice.LightStream.COM>
To: Robert Keller <keller@trurl.informatik.uni-dortmund.de>
Cc: genetic-programming@cs.stanford.edu, dfaulkne@LightStream.COM
Subject: Re: GP, P & NP 
In-Reply-To: Your message of "Fri, 11 Mar 1994 18:50:54 +0100."
             <9403111750.AA18261@trurl.informatik.uni-dortmund.de> 
Date: Fri, 11 Mar 1994 16:12:43 -0500
From: Dave Faulkner <dfaulkne@LightStream.COM>
Status: RO


A couple of interesting references on this topic might be:

ICGA89:
    De Jong, K.A. and W. M. Spears, "Using Genetic Algorithms to Solve
    NP-Complete Problems", International Conference on Genetic
    Algorithms, George Mason University, Fairfax, Virginia, June 1989,
    pgs. 124 - 132.


IJCNN90:
    Spears, W. M. and K.A. De Jong, "Using Neural Networks and Genetic
    Algorithms as Heuristics for NP-Complete Problems", International
    Joint Conference on Neural Networks, Washington D.C, January 1990, pgs.
    118 - 121.

to be found at ftp.aic.nrl.navy.mil /pub/spears.

/df/

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 15:16:41 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA24306
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 15:01:43 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id MAA22630 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 12:06:30 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from rockvax.rockefeller.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA13469; Fri, 11 Mar 1994 12:05:25 -0800
Received: from darst-sgi.rockefeller.edu by rockvax.ROCKEFELLER.EDU (5.65/1.34)
	id AA02064; Fri, 11 Mar 94 15:05:22 -0500
Received: by darst-sgi.rockefeller.edu (920330.SGI/920502.SGI)
	for @rockvax.rockefeller.edu:genetic-programming@cs.stanford.edu id AA05996; Fri, 11 Mar 94 15:06:25 -0500
Date: Fri, 11 Mar 94 15:06:25 -0500
From: stebbic@darst-sgi.ROCKEFELLER.EDU (Charles Erec Stebbins)
Message-Id: <9403112006.AA05996@darst-sgi.rockefeller.edu>
To: genetic-programming@CS.Stanford.EDU
Subject: GP and diversity
Status: RO


With regard to the problem of diversity in GP, two issues stand
out in my mind:

	1)  The need to converge to a solution
	    (i.e., a limit to diversity)

	2)  The need to converge to the best (or acceptable)
	    solution (i.e., a need for diversity)

Without some limit to diversity, the set of necessary combinations
of components will not be heard in the 'noise' of the many other
diverse possibilities.  The extreme illustration would be an
infinite population with an infinite number of functions and
unlimited size to the individuals (with the control variable
of finite # generations).

On the other hand, too strong a filter on possible constructs can
remove from the population the necessary components to create the
programs that can scale the fitness peaks.  It may be that this
is a common trap for current reproductive strategies in most
GP implementations.  By this I mean that the very fact that
entire individuals are eliminated from generation to generation
could remove from the population the very elements needed to
reach the best solutions.

Perhaps it is the elimination of genetic material on such a 
large scale (in most GP runs, one individual is a much larger
fraction of the whole population than in most biological
systems - consider bacteria, which are highly adaptable organisms) 
that causes some of the problem.

A possible solution would be to allow more genetic material to
continue through the population.  But one does not want to
'saturate' the gene pool, either (issue 1 above).

A way to balance these two issues might be to allow for 
differential expression of genetic material.  This is what most
of the more complicated biological organisms do anyway.  Human 
beings have a great store of coding DNA that is

	1)  Expressed only in certain cells

	2)  Expressed only at certain times

	3)  Expressed to different degrees (not binary expression)

	4)  Expressed in relation to how other genes are expressed
	
	5)  Expressed in relation to environmental conditions


4) and 5)  allow for feedback and non-linear behavior.  If a method
could be developed to allow such a controlled preservation of
genetic material in GP, one might gain some of the robustness of
biological evolution (which, it should be remembered, is not
free from local minima either!  Such facts make films like
Jurassic Park an unfortunate possibility ;->).

Perhaps analogies to gene expression (within a cell) and to
multicellular differentiation (tissues) might be fruitful
starting places for an application of controlled preservation
of genetic material to GP.  The controlling mechanisms may
also, of course, be subject to evolutionary pressures.


Erec Stebbins

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 15:07:29 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA23814
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 14:41:14 -0600
Received: from Xenon.Stanford.EDU (Xenon.Stanford.EDU [36.28.0.25]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA22593 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 11:46:42 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received:  by Xenon.Stanford.EDU (5.61+IDA/25-CS-eef) id AA26182; Fri, 11 Mar 94 11:45:35 -0800
From: "Patrik D'haeseleer" <pdhaes@CS.Stanford.EDU>
Message-Id: <9403111945.AA26182@Xenon.Stanford.EDU>
Subject: Re: Diversity and sexiness in GA/GP
To: dfaulkne@LightStream.COM (Dave Faulkner)
Date: Fri, 11 Mar 1994 11:45:34 -0800 (PST)
Cc: genetic-programming@CS.Stanford.EDU
In-Reply-To: <9403111728.AA11903@cockatrice.LightStream.COM> from "Dave Faulkner" at Mar 11, 94 12:28:14 pm
X-Mailer: ELM [version 2.4 PL21]
Content-Type: text
Content-Length: 3062      
Status: RO

Dave Faulkner writes:
>It seems that we're back to this problem of diversity and how to get there
>in GP.  A number of points that I made earlier stressed the fact that
>a simple use of fitness values to obtain diversity can't work because
>you're looking at only one dimension of the fitness space (the height
>of the hill), and as such, there is no measure of structural distance
>to "bias" (nudge) the population toward a set of diverse  viable solutions.

On the other hand, most problems *do* offer you more than one dimension of
the fitness space, because they typically involve several fitness cases
(which is exactly what the "sexiness" idea was based on). 

In the paper I wrote together with Jason Bluming for Kinnear's "Advances in
Genetic Programming" book, we defined a metric for genotypical and phenotypical
distance, and genotypical and phenotypical diversity in the following way:

1) phenotypically:

Each individual didn't really have a "fixed" fitness value. Fitness for two
individuals was calculated by letting them "combat" with each other in
an arena. However, to get a more or less absolute fitness measure to analyze
a population after a number of generations, we wrote 13 test individuals by
hand, covering as wide a range of different tactics as we could think of.

Every individual in the population was then evaluated against those 13 test
programs, yielding a "behavior signature" of 13 values. Phenotypical distance
was then defined in terms of correlation between signatures, and
phenotypical diversity was defined as the *average* correlation between the
signatures of any two individuals.

In a regular GP run with multiple fitness cases, the fitness values
for each of the cases could trivially be combined in a fitness signature, and
treated in exactly the same way.
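
A minimal sketch of the signature comparison described above, assuming
Pearson correlation is the correlation meant (the message does not say
which one was used). The signatures here have 4 made-up values each rather
than 13, and the handling of a flat signature is my own assumption.

```python
from itertools import combinations
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equal-length signatures."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: treat a flat signature as uncorrelated
    return cov / sqrt(var_a * var_b)

def average_correlation(signatures):
    """Mean correlation over all pairs of individuals (needs >= 2)."""
    pairs = list(combinations(signatures, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)

# Hypothetical populations: one where everybody behaves alike,
# one where behaviors differ.
similar = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 2, 3, 5]]
mixed = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 3, 2, 4]]

print(average_correlation(similar))
print(average_correlation(mixed))
```

With this definition a lower average correlation indicates a more
diverse population.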

2) genotypically:

Exactly the same principle, except that we now used a "frequency signature",
consisting of the relative frequency of occurrence of each of the 6 functions
and 6 terminals used in the individuals.
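
A sketch of how such a frequency signature could be computed, assuming
programs are stored as nested tuples of the form (function, arg, ...).
The primitive names below are hypothetical; the actual 6 functions and
6 terminals from the paper are not listed in this message.

```python
from collections import Counter

# Hypothetical primitive set, for illustration only.
PRIMITIVES = ["if", "not", "food-ahead", "move", "left", "right"]

def count_symbols(tree, counts):
    """Accumulate symbol counts for a program given as nested tuples."""
    if isinstance(tree, tuple):
        counts[tree[0]] += 1           # function at the head of the node
        for child in tree[1:]:
            count_symbols(child, counts)
    else:
        counts[tree] += 1              # terminal leaf
    return counts

def frequency_signature(tree, primitives=PRIMITIVES):
    """Relative frequency of each primitive, in a fixed order."""
    counts = count_symbols(tree, Counter())
    total = sum(counts.values())
    return [counts[p] / total for p in primitives]

program = ("if", ("not", "food-ahead"),
                 ("if", "food-ahead", "move", "right"),
                 "move")
signature = frequency_signature(program)
print(signature)
```

Two such signatures can then be compared in the same way as the
behavior signatures. As noted below, this ignores the tree structure.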


Of course, these metrics are far from perfect. For instance, we would want our
set of 13 test programs to be perfectly orthogonal, and to cover the entire
spectrum of possible tactics. And for the genotypical analysis, we didn't even
look at the *structures* formed with the functions and terminals.

Defective as these metrics are, they did allow us to clearly identify deme-like
structures in our population for instance. Also, we noticed that both metrics
would usually agree fairly closely on the location of those demes.


Definitely an area that could use some more investigation though. For instance,
it would be interesting to see what the correlation is between phenotypical
and genotypical distance. In other words, can we use phenotypical metrics
(such as a fitness signature of N fitness cases) to maintain genotypical
diversity?


Patrik

PS: sorry for rambling on, maybe I should get my act together and put that
paper on the ftp site some time. On the other hand, Kim's book is supposed
to appear Real Soon Now, right? ...

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 15:57:50 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA25711
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 15:42:23 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id MAA22652 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 12:42:22 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from sun2.nsfnet-relay.ac.uk by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA14865; Fri, 11 Mar 1994 12:41:17 -0800
Via: uk.ac.sunderland.consgate; Fri, 11 Mar 1994 19:39:44 +0000
Via: isis.sunderland.ac.uk (isis.sund.ac.uk); Fri, 11 Mar 1994 19:38:35 +0000
Received: by isis.sunderland.ac.uk (4.1/SMI-4.1) id AA22433;
          Fri, 11 Mar 94 19:39:43 GMT
From: cs0ral@isis.sunderland.ac.uk (r.aler)
Message-Id: <9403111939.AA22433@isis.sunderland.ac.uk>
Subject: default hierarchies
To: genetic-programming@cs.stanford.edu (genetic)
Date: Fri, 11 Mar 1994 19:39:43 +0000 (GMT)
X-Mailer: ELM [version 2.4 PL22]
Content-Type: text
Content-Length: 1541
Status: RO

 > > Hmmm, maybe you're right.  Maybe GPs just aren't a good tactic for
 > > combining subsolutions.
 > > 
 > > Meta-GP idea:
 > > 
 > > Breed a function that's pretty good at doing task A.  When this
 > > converges, if it's perfect (or good enough), use that.  If not, breed
 > > another function that's good at spotting the cases that mess up the
 > > original function.  Then, breed a third function that focuses on the
 > > pathological cases, and build this:
 > > 
 > > (if (pathological problem)
 > > 	(pathological-solution problem)
 > > 	(general-solution problem))
 > > 
 > > This could, of course, be applied recursively.  It could also be mixed
 > > with other methods -- maybe neural nets for some parts, explicitly
 > > designed rules for others.
 > > 
 > > Waddya think?
 
 	This is called a default hierarchy, I think, and these were studied
 by Holland in his book. Sometimes GP produces default hierarchies if you
 put the function if-then-else in the soup. But perhaps you are right and
 GP is not good at combining subsolutions. However, if GP is not good at
 that, then what is GP good for? I think it all depends on putting
 the right functions in the soup (functions to recognize pathological
 problems, if-then-else functions to combine subsolutions).
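
 A rough sketch of the quoted (if (pathological problem) ...) construct,
 written in Python. The three component functions are hand-written
 stand-ins for what would be separately evolved programs, so all names
 and behaviors here are hypothetical.

```python
def make_default_hierarchy(is_pathological, pathological_solution,
                           general_solution):
    """Wrap a general solution with an exception handler for the
    cases it is known to get wrong."""
    def solve(problem):
        if is_pathological(problem):
            return pathological_solution(problem)
        return general_solution(problem)
    return solve

# Stand-in for an evolved general solution that fails on some inputs.
general = lambda x: 10 / x

# First layer: spot and handle the case that breaks the general solution.
layer1 = make_default_hierarchy(lambda x: x == 0,
                                lambda x: 0.0,
                                general)

# Applied recursively: the layered solver becomes the new default.
layer2 = make_default_hierarchy(lambda x: x < 0,
                                lambda x: 10 / -x,
                                layer1)

print(layer2(5), layer2(0), layer2(-5))
```

 Each new layer treats everything beneath it as the default, which is
 the sense in which the idea can be applied recursively.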
 
 > with Pat Langley, it seems that one could/should build classifier systems
 > using GP as the rule formation 'engine' and play with that.  I dont know to
 
 	I am (trying) to do this to evolve control rules for a task
 planning tool (Prodigy).
 
 			Ricardo.
 
 
------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 14:29:28 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA22960
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 14:14:16 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA22527 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 11:27:37 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from lightstream.LightStream.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA11758; Fri, 11 Mar 1994 11:26:32 -0800
Received: from cockatrice.LightStream.COM by lightstream.LightStream.COM (4.1/SMI-4.1)
	id AA28040; Fri, 11 Mar 94 14:26:31 EST
Received: by cockatrice.LightStream.COM (4.1/SMI-4.1)
	id AA12546; Fri, 11 Mar 94 14:26:30 EST
Message-Id: <9403111926.AA12546@cockatrice.LightStream.COM>
To: genetic-programming@cs.stanford.edu
Cc: dfaulkne@LightStream.COM
Subject: Re: inbreeding vs exogamy <=> short term vs long term
Date: Fri, 11 Mar 1994 14:26:29 -0500
From: Dave Faulkner <dfaulkne@LightStream.COM>
Status: RO


Steve Pothier writes:

We should take care not to carry the analogy too far.  I don't CARE if
my GP produced programs that are the end result of GP breeding have
"low diversity" or "little ability to survive".  This is because, for
me at least, GP produced programs are a product that I will remove
from the GP system and use in "production mode".  Those programs have
no opportunity for interaction with others in their environment
(unlike nature).

*****

What many are describing is the (potential) ability for GA/GP populations
to be a store for nearly correct, or very usable, solutions to problems
similar to the one(s) that an original run produced AND to find equally
good solutions to a particular problem in a multi-modal fitness space.

For instance, say your training function learns how to gather food well
in one of these ant simulations with a particular food landscape.  If
you change the food landscape around a bit, will the population do as well?
Should you throw out the trained population, or start with the trained one
and continue?  Typically, without good diverse populations, the best
thing to do is start fresh and let the diversity of the random initial
population find the new peaks.  This is expensive, and there's a lot of 
learning built into the trained population that you should be able to
take advantage of, or at least you'd like to.  The problem is that without
a diverse "multi-modal" population, inbreeding of members of a population
all stuck on the same particular hill gets you nowhere, even when there's
a better hill out there to climb.  Eventually you might get unstuck due
to genetic drift or mutation, but it's not efficient (that's why it's
usually better to start over).  Hence diversity: Yes, I've found that
big hill, and I use it in a production program, but if next week the fitness
landscape changes a bit, I'll be able to quickly re-train the population, 
and this is more likely if my population has members on ALL hills,
not just the best one.

- Dave Faulkner

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 13:45:08 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA21515
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 13:33:32 -0600
Received: from Xenon.Stanford.EDU (Xenon.Stanford.EDU [36.28.0.25]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id KAA22417 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 10:50:14 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received:  by Xenon.Stanford.EDU (5.61+IDA/25-CS-eef) id AA20959; Fri, 11 Mar 94 10:49:05 -0800
From: "Patrik D'haeseleer" <pdhaes@CS.Stanford.EDU>
Message-Id: <9403111849.AA20959@Xenon.Stanford.EDU>
Subject: Re: Sexiness:  Did I have it backwards?
To: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Date: Fri, 11 Mar 1994 10:49:04 -0800 (PST)
Cc: hthies@willamette.edu, ekerr@willamette.edu,
        genetic-programming@CS.Stanford.EDU, levenick@willamette.edu,
        french@willamette.edu
In-Reply-To: <9403110236.AA08357@hume.CS.ORST.EDU> from "Peter Dudey" at Mar 10, 94 06:36:05 pm
X-Mailer: ELM [version 2.4 PL21]
Content-Type: text
Content-Length: 2272      
Status: RO

Peter Dudey writes:
>
>I've done some initial tests of sexiness, and the results haven't been
>very impressive.  It doesn't seem to significantly worsen things, but
>it doesn't seem to significantly improve them, either.

My take on diversity is that it does not necessarily make you arrive at a
solution *faster*. You are a lot less likely to get stuck in a local maximum,
though. Spreading out the individuals in your population over a larger area of
the search space means that you have fewer resources available in any one
region. On the other hand, you're less likely to overlook a narrow global
maximum as well.

I'd suggest you try using your sexiness technique on a problem that tends
to get stuck in a non-optimal solution very easily (I hate calling this 
"premature convergence", because I've seen runs do this after 50-100
generations... not exactly what I'd call "premature").

Also, because you are spreading out your individuals thinner and over a larger
area, you might want to try this with a moderately large population size,
say preferably a couple of K.

I guess what I'm saying is: take a problem that's HARD enough to actually need
more diversity to solve it! A problem that still gets stuck at a local maximum
even when you run it with a pop size of a couple of K.


Another thing to keep in mind is that when crossing two very "sexy"
individuals, you're just as likely to combine their *bad* characteristics
as their *good* characteristics. From this point of view, we would expect
the average fitness of the offspring to be about the same as without
sexiness.

However, if two individuals find each other very sexy, that also means that
the individuals are probably very different *genotypically*. Crossing over
two radically differently structured individuals is a lot more likely to
result in a non-viable hybrid! Therefore, although the maximum fitness 
of the offspring may be larger than without using sexiness, I would expect the
average fitness to be in fact a lot lower, which will effectively slow down
the evolution!
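
To make the argument concrete, here is one guess at what a "sexiness"
score based on fitness cases could look like. The scoring rule is
entirely my assumption, as are the per-case scores in [0, 1]: a candidate
is attractive to the extent that it does well on the cases where the
chooser does badly.

```python
def attractiveness(chooser, candidate):
    """How well the candidate covers the fitness cases the chooser
    handles poorly. Both are lists of per-case scores in [0, 1]."""
    return sum((1.0 - own) * other
               for own, other in zip(chooser, candidate))

def choose_mate(chooser, candidates):
    """Pick the most attractive candidate for this chooser."""
    return max(candidates, key=lambda c: attractiveness(chooser, c))

chooser = [1.0, 1.0, 0.0, 0.0]       # solves cases 1-2 only
candidates = [
    [1.0, 1.0, 0.0, 0.0],            # behaves like the chooser
    [0.0, 0.0, 1.0, 1.0],            # exact complement
    [0.5, 0.5, 0.5, 0.5],            # mediocre everywhere
]

mate = choose_mate(chooser, candidates)
print(mate)
```

Under a rule like this the preferred mate is the one whose behavior
differs most from the chooser's, which is why the two parents are also
likely to differ structurally.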

One way to get around this might be to impose some kind of structure on the
individuals a priori, for instance using ADF's. Haven't thought this through
completely yet, but it might be something interesting to try...


Patrik

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 13:27:19 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA20950
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 13:14:06 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id KAA22329 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 10:22:56 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from cs.columbia.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA07554; Fri, 11 Mar 1994 10:21:52 -0800
Received: from ground.cs.columbia.edu (ground.cs.columbia.edu [128.59.1.3]) by cs.columbia.edu (8.6.4/8.6.4) with ESMTP id NAA29267; Fri, 11 Mar 1994 13:21:45 -0500
Received: from localhost (evs@localhost) by ground.cs.columbia.edu (8.6.4/8.6.4) id NAA10691; Fri, 11 Mar 1994 13:21:44 -0500
Date: Fri, 11 Mar 94 13:21:44 EST
From: Eric Siegel <evs@cs.columbia.edu>
To: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Cc: genetic-programming@cs.stanford.edu, stolfo@cs.columbia.edu
Subject: Stacked classifiers (was Re: Sexiness...)
In-Reply-To: Your message of Fri, 11 Mar 94 08:32:53 PST
Message-Id: <CMM.0.90.2.763410104.evs@ground.cs.columbia.edu>
Status: RO

FYI, non-GA people have induced classifiers to approach the (meta-)task of
classifying examples which are poorly classified by another,
previously-induced classifier.  For example:

Wolpert, D. H. "Stacked Generalization"  In Neural Networks, Vol. 5, pp
241-259, 1992.

Chan P.K. and Stolfo, S.J. "Toward Scalable and Parallel Inductive
Learning:  A Case Study in Splice Junction Prediction" submitted to Machine
Learning (special issue on ML and molecular bio) 1993?


I'd like to see a GA meta-learn on top of statistical decision trees.

-Eric

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 12:58:13 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA20078
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 12:44:03 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA22291 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 09:52:06 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from waldorf.Informatik.Uni-Dortmund.DE by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA05985; Fri, 11 Mar 1994 09:50:59 -0800
Received: from trurl.informatik.uni-dortmund.de
	by waldorf.informatik.uni-dortmund.de with SMTP (Sendmail 8.6.5/UniDo 2.0.14)
        id SAA23393; Fri, 11 Mar 1994 18:50:54 +0100
From: Robert Keller <keller@trurl.informatik.uni-dortmund.de>
Date: Fri, 11 Mar 94 18:50:54 +0100
Message-Id: <9403111750.AA18261@trurl.informatik.uni-dortmund.de>
Received: by trurl.informatik.uni-dortmund.de id AA18261; Fri, 11 Mar 94 18:50:54 +0100
To: genetic-programming@cs.stanford.edu
Subject: GP, P & NP
Cc: keller@trurl.informatik.uni-dortmund.de
Status: RO

So what about this one?

There are hundreds of known NP-complete problems. 

GP could try to evolve solving algorithms for all of them. Each 100%-correct
solution is then checked if it's from P. Got one? Bingo! GP has shown
P=NP!

We perhaps only need some CPU-millennia and definitely a lot of hope.

Go ahead & try - I'm too busy tonite for getting famous.

Seriously:

During the last couple of years, computers were successfully used in
mathematics (e.g. graph theory) to support humans in the completion of
long-outstanding proofs of famous conjectures.

If P=NP holds, then the numerous vain attempts at finding a P-algorithm
show that human thinking is not likely to construct such an algorithm,
i.e. we tend to construct algorithms for NP-complete problems along
paths in search space which are not likely to lead to a P-algorithm.
Choosing paths which strongly differ from "normal" paths could then
lead to success.
  When constructing an algorithm for an arbitrary given problem, GP
tends to choose such strongly differing paths. So why not use GP for
some appropriate problems of theoretical computer science in addition to
using it for real-world problems?
  Even if it fails to construct the algorithms needed for certain
proofs, careful analysis of the evolved algorithms may lead to new
ideas on how to tackle certain theoretical problems.


Robert

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 12:29:22 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA19444
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 12:22:18 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA22252 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 09:29:22 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from lightstream.LightStream.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA04720; Fri, 11 Mar 1994 09:28:17 -0800
Received: from cockatrice.LightStream.COM by lightstream.LightStream.COM (4.1/SMI-4.1)
	id AA24666; Fri, 11 Mar 94 12:28:16 EST
Received: by cockatrice.LightStream.COM (4.1/SMI-4.1)
	id AA11903; Fri, 11 Mar 94 12:28:14 EST
Message-Id: <9403111728.AA11903@cockatrice.LightStream.COM>
To: genetic-programming@CS.Stanford.EDU
Cc: dfaulkne@LightStream.COM
Subject: Re: Diversity and sexiness in GA/GP 
Date: Fri, 11 Mar 1994 12:28:14 -0500
From: Dave Faulkner <dfaulkne@LightStream.COM>
Status: RO


It seems that we're back to this problem of diversity and how to get there
in GP.  A number of points that I made earlier stressed the fact that
a simple use of fitness values to obtain diversity can't work because
you're looking at only one dimension of the fitness space (the height
of the hill), and as such, there is no measure of structural distance
to "bias" (nudge) the population toward a set of diverse viable solutions.
Without this bias in the GA or the GP, there is no tendency toward better or
worse solutions, or less or more diverse populations (i.e., there's
no change in the probabilistic mechanisms to change the characteristics
of the algorithm). As such, your results (or lack thereof) don't surprise
me.

I don't understand the statement that "Maybe GPs just aren't a good tactic for
combining subsolutions."  Isn't this what GA/GP is all about?  What's
all this talk of hyper-planes if not the combination of sub-solutions that
are semi-viable. Also, what of building-blocks and ADF's?  I think you're
missing a fundamental point here.

I've been looking a bit for ways to encourage diversity without the use of
metrics (since, at best, their complexity is O(n**2) and so they are
potentially prohibitive in large populations). Freytag's suggestion of
using metrics within a tournament selection scheme attempts to reduce
this complexity to a reasonable level, and has some merit, but there
are potentially different and better ways of looking at the problem
(beyond the crowding schemes that Goldberg et al. have looked at). Examples
include "demes" and "tagging".
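
To make the O(n**2) cost of a metric concrete, here is a minimal Python sketch.
It assumes programs are stored as nested tuples and uses a made-up overlay
distance (tree_distance is hypothetical, not a metric taken from any of the
papers mentioned in this thread). The point is only that a population-wide
diversity measure has to visit every pair.

```python
from itertools import combinations

def size(tree):
    """Number of nodes in a nested-tuple expression tree."""
    if isinstance(tree, tuple):
        return 1 + sum(size(child) for child in tree[1:])
    return 1

def tree_distance(a, b):
    """Hypothetical metric: count nodes that differ when the trees are overlaid."""
    a_is_node, b_is_node = isinstance(a, tuple), isinstance(b, tuple)
    if not a_is_node and not b_is_node:
        return 0 if a == b else 1
    if a_is_node != b_is_node:
        # A leaf against a subtree: charge the size of the larger side.
        return max(size(a), size(b))
    # Both are internal nodes: compare operators, then children position by position.
    dist = 0 if a[0] == b[0] else 1
    kids_a, kids_b = a[1:], b[1:]
    for child_a, child_b in zip(kids_a, kids_b):
        dist += tree_distance(child_a, child_b)
    # Children without a counterpart on the other side count in full.
    longer = kids_a if len(kids_a) > len(kids_b) else kids_b
    for extra in longer[min(len(kids_a), len(kids_b)):]:
        dist += size(extra)
    return dist

def mean_pairwise_distance(population):
    """Average distance over all n*(n-1)/2 unordered pairs (needs n >= 2)."""
    pairs = list(combinations(population, 2))
    return sum(tree_distance(a, b) for a, b in pairs) / len(pairs)

population = [
    ("+", "x", "y"),
    ("+", "x", "y"),
    ("*", "x", ("-", "y", "1")),
]
print(mean_pairwise_distance(population))
```

With a population of 20000 this loop would evaluate roughly 200 million
pairs per generation, which is why restricting the metric to a handful of
tournament winners is attractive.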

Two papers that I've looked at that seem interesting and applicable are:

    Spears, William M., "Simple Subpopulation Schemes", Artificial
    Intelligence Center Internal Report #AIC-93-020, Naval Research
    Laboratory, Washington, DC 20375. Submitted to Evolutionary
    Programming 1994. ftp.aic.nrl.navy.mil /pub/spears.

    Collins.  Studies in Artificial Evolution. PhD dissertation, UCLA,
    1992.


I got both of these papers via anonymous ftp.  Collins discusses results
using demes and Spears talks about a scheme simpler than demes that he
calls "tagging".

These papers discuss GA/EA algorithmic techniques, but are probably
equally applicable to GP (after all, isn't GP more of a representation
shift than a paradigm shift?).

- Dave Faulkner

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 12:00:51 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA18048
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 11:40:26 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id IAA22160 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 08:45:59 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from aries.SAIC.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA02980; Fri, 11 Mar 1994 08:44:55 -0800
Received: from deneb.saic.com.saic.com by aries.saic.com (4.1/SMI-4.1)
	id AA02792; Fri, 11 Mar 94 09:42:52 MST
Date: Fri, 11 Mar 94 09:42:52 MST
From: pothiers@aries.saic.com (Steve Pothier)
Message-Id: <9403111642.AA02792@aries.saic.com>
Received: by deneb.saic.com.saic.com (4.1/SMI-4.1)
	id AA09480; Fri, 11 Mar 94 09:42:56 MST
To: jan@CS.Stanford.EDU
Cc: genetic-programming@CS.Stanford.EDU
In-Reply-To: <CMM.0.90.4.763359435.jan@Xenon.Stanford.EDU> (message from zinJANthropus JANNINK on Thu, 10 Mar 94 20:17:15 PST)
Subject: Re: inbreeding vs exogamy <=> short term vs long term
   Date: Thu, 10 Mar 94 20:17:15 PST
   From: zinJANthropus JANNINK <jan@CS.Stanford.EDU>
Status: RO

   A lot of the bandwidth lately has gone into the discussion of methods to
   breed better individuals.  I think we need to take a look at nature to see
   more clearly that there is an unending competition between long and short
   term strategies.

   If we consider specialization

   we see that highly specialized critters tend to have a short term advantage,
   whereas less specialized ones tend to last longer evolutionarily speaking.

   If we look at agriculture and horticulture

   we see that humans have bred highly productive crops and attractive flowers
   that have very low genetic diversity and little ability to survive on their
   own.

We should take care not to carry the analogy too far.  I don't CARE if
my GP produced programs that are the end result of GP breeding have
"low diversity" or "little ability to survive".  This is because, for
me at least, GP produced programs are a product that I will remove
from the GP system and use in "production mode".  Those programs have
no opportunity for interaction with others in their environment
(unlike nature).

I do appreciate the distinction between long and short term
strategies, however.  But, in my mind, those distinctions only come
into play when considering how many generations to run the GP
breeding.   When the fitness function is really expensive, I need to
focus on short term strategies; when it is cheaper I have the luxury of
trying long term strategies.

-Pothier-

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 12:04:07 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA18035
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 11:40:18 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id IAA22118 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 08:33:59 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA02445; Fri, 11 Mar 1994 08:32:55 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA08106; Fri, 11 Mar 94 08:32:54 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA08457; Fri, 11 Mar 94 08:32:53 PST
Date: Fri, 11 Mar 94 08:32:53 PST
Message-Id: <9403111632.AA08457@hume.CS.ORST.EDU>
To: genetic-programming@CS.Stanford.EDU
Subject: [phred@leland.Stanford.EDU: Re: Sexiness:  Did I have it backwards?..]
Status: RO

From: David Andre <phred@leland.Stanford.EDU>
Subject: Re: Sexiness:  Did I have it backwards?..
To: dudeyp@research.CS.ORST.EDU (Peter Dudey)
Date: Thu, 10 Mar 1994 23:04:04 -0800 (PST)
In-Reply-To: <9403110410.AA08383@hume.CS.ORST.EDU> from "Peter Dudey" at Mar 10, 94 08:10:27 pm
X-Mailer: ELM [version 2.4 PL21]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 2203      


> 
> Hmmm, maybe you're right.  Maybe GPs just aren't a good tactic for
> combining subsolutions.
> 
> Meta-GP idea:
> 
> Breed a function that's pretty good at doing task A.  When this
> converges, if it's perfect (or good enough), use that.  If not, breed
> another function that's good at spotting the cases that mess up the
> original function.  Then, breed a third function that focuses on the
> pathological cases, and build this:
> 
> (if (pathological problem)
> 	(pathological-solution problem)
> 	(general-solution problem))
> 
> This could, of course, be applied recursively.  It could also be mixed
> with other methods -- maybe neural nets for some parts, explicitly
> designed rules for others.
> 
> Waddya think?

Exactly the kinds of things I thought about when I was doing my
OCR work -- take several mostly correct solutions and use them in 
concert somehow...I think that more work needs to be done on 
*automated* systems for using GP-derived rules in concert with 
human designed rules (and NN's, and each other).  Imagine a system 
(say with some nice knowledge base structure) that uses GP and GA and
NN's and combinations to attempt to solve problems and sub-problems.
If the system gets bogged down, bail, use the rule as a good starting place and
build onto it some structure as you have above, but *automatically*

>  > perhaps sexiness works better in classifier (like) systems that have
>  > multiple sets of 'rules' per individual...
> 
> Are you talking about Holland's classifier systems?  I need
> references!  :)

Yes...I know about it from John's class, Goldberg's GA textbook, and 
Holland's book (Induction).  Also, from some recent ideas I've talked about
with Pat Langley, it seems that one could/should build classifier systems
using GP as the rule formation 'engine' and play with that.  I don't know to
what extent this has been done.  I hope to find out some answers by 
talking to folx at the WCCI conference.

> 
> Coming up with ideas faster than I can test them,

Oh, yeah, you too?  I wish I had 7 graduate students to test ideas for me
:->.  (heck, I wish I were a grad student....) 


David Andre
Undergraduate student in AI and Learning Theory

;->

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sat Mar 12 10:57:51 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA03274
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sat, 12 Mar 1994 10:53:44 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id IAA23765 for <Genetic-Programming@list.stanford.edu>; Sat, 12 Mar 1994 08:05:34 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from worldlink.worldlink.com (worldlink.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA12620; Sat, 12 Mar 1994 08:04:30 -0800
Received: by worldlink.worldlink.com (5.65b/4.0.071791-Worldlink)
	id AA10517; Sat, 12 Mar 94 11:04:33 -0500
Message-Id: <2972482234.0.p00396@psilink.com>
In-Reply-To: <9403111849.AA20959@Xenon.Stanford.EDU>
Date: Fri, 11 Mar 94 10:09:58 -0500
To: "GP list" <genetic-programming@cs.stanford.edu>
From: "Andrew Singleton" <p00396@psilink.com>
Organization: Creation Mechanics
Subject: Re: Sexiness: Did I have it backwards?
X-Mailer: PSILink-DOS (3.4)
Status: RO

The issue of combining good subsolutions is an important one, but we 
may have looked at it the wrong way.  I suggest a more productive 
line of study below.


Experimental evidence shows that combining very different individuals
through crossover is a recipe for failure.  I wrote to Peter Dudey and 
suggested this might be the case, and he discovered the same
experimentally.  Studies of "deme" based massively parallel genetic 
algorithms demonstrate this principle graphically.  The boundaries 
between the demes are always marked with "lethal" failed crosses.

A biologist pointed out to me that the "hybrid vigor" is almost
exclusively a phenomenon of highly inbred domestic crops.  In robust
wild species, hybridization rarely produces a good result.

GP runs produce populations of similar trees which can cross 
successfully. This homogenization is a critical part of the GP process, 
since if you crossed random trees, your results would be very poor.  
In fitness space, the GP run smooths the space with respect to 
crossover.  This necessarily produces specialization.  In nature, this 
is called speciation.

One way to look at this is that populations evolve, rather than 
individuals.  Some very experienced GPers (Walter Tackett, Peter 
Angeline, and others) have gone on record favoring small populations.  
There is a reason for this:  Small populations have more genetic drift 
than large populations, and have the possibility of wandering off a 
local optimum.
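
The drift claim can be illustrated with a minimal Python sketch of neutral
resampling (a Wright-Fisher style model, chosen here as an assumption; it is
not taken from any GP system discussed on the list). Two alleles start at
50% with no fitness difference, and the sketch counts the generations until
chance alone eliminates one of them.

```python
import random

def generations_to_fixation(pop_size, rng):
    """Resample a two-allele population until one allele has taken over."""
    count_a = pop_size // 2          # allele A starts at 50%
    generations = 0
    while 0 < count_a < pop_size:
        freq_a = count_a / pop_size
        # Each slot in the next generation copies a random parent; no selection.
        count_a = sum(1 for _ in range(pop_size) if rng.random() < freq_a)
        generations += 1
    return generations

def mean_fixation_time(pop_size, trials=100, seed=0):
    """Average fixation time over several seeded runs."""
    rng = random.Random(seed)
    total = sum(generations_to_fixation(pop_size, rng) for _ in range(trials))
    return total / trials

for n in (10, 100):
    print(n, mean_fixation_time(n))
```

The small population loses an allele far sooner than the large one, which is
drift in action. The sketch shows only that drift is stronger in small
populations; whether it helps a GP run wander off a local optimum is a
separate question that it does not test.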

Interestingly, the biological hypothesis of "punctuated equilibrium"
relies on exactly this mechanism.  Big populations demonstrate an
equilibrium.  Small, isolated populations gain some new fitness
attributes, and then they invade the range of the original population
and wipe it out in a "punctuated" event.  Interbreeding is not an 
important mechanism.


Speciation is a big irritation in GP, since it is very difficult to 
combine good solutions from one run with good solutions in another run. 
This brings us back to the original question:

How do we combine two partial solutions to make a better solution? 

The answer to this question is in many ways the holy grail of GP.  If 
we can answer it, we will have the answer to the efficiency problem - We 
can start with pre-evolved libraries, rather than running everything 
from scratch - and the all-important hierarchy question - How 
do we combine parts (subroutines) to make a more complex whole?

We know one thing.  The answer is NOT subtree crossover.

Some possibilities have been proposed:
1) A classifier type cooperating set of GP individuals;
2) Evolving or explicitly constructing a hierarchy of pre-evolved 
individuals, based on parcelling out fitness cases.
As was pointed out, the bidding process for a classifier system is just 
a way of evolving this case by case hierarchy, so these solutions are 
very similar.

There are other mechanisms.  I have been working on my own solution for 
this problem, and I look forward to hearing more suggestions.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 07:10:54 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA10232
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 07:06:45 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id EAA21947 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Mar 1994 04:09:01 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from gwusun.seas.gwu.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA24843; Fri, 11 Mar 1994 04:07:56 -0800
Received: by gwusun.seas.gwu.edu id AA23559
  (5.65c/IDA-1.4.4 for genetic-programming@cs.stanford.edu); Fri, 11 Mar 1994 07:07:45 -0500
From: Richard Freytag <freytag@seas.gwu.edu>
Message-Id: <199403111207.AA23559@gwusun.seas.gwu.edu>
Subject: Re: Diversity and sexiness in GA/GP
To: genetic-programming@cs.stanford.edu
Date: Fri, 11 Mar 1994 07:07:45 -0500 (EST)
Cc: gary@enws320.eas.asu.edu (Kevin Gary), GAVINM@delphi.com (Gavin Mclachlan),
        72262.3067@CompuServe.COM (Richard A. Freytag),
        70300.115@CompuServe.COM (Paul Sparks),
        hull@eos.arc.nasa.gov (Kent Hull), mgrant@sun.com (Michael Grant),
        chatham@isi.edu (Ralph Chatham)
X-Mailer: ELM [version 2.4 PL21]
Content-Type: text
Content-Length: 1557      
Status: RO

Forwarded message:
>From freytag Sun Mar  6 16:19:30 1994
From: Richard Freytag <freytag@seas.gwu.edu>
Message-Id: <199403062119.AA28284@gwusun.seas.gwu.edu>
Subject: Re: Diversity and sexiness in GA/GP
To: freytag@seas.gwu.edu (Richard Freytag)
Date: Sun, 6 Mar 1994 16:19:27 -0500 (EST)
X-Mailer: ELM [version 2.4 PL21]
Content-Type: text
Content-Length: 1188      

> 
> Just to add in my $0.02...
> 
> What if we were to combine the sexiness idea with the order-2
> tournaments idea (sorry, forgot the reference):
> 
> For each pairing, conduct n sets of m [normal] fitness tournaments,
> resulting in n winners.  Compute the distance* between each pair (n
> taken 2 at a time).  The pair with the largest distance get to fool around.
> 
> Every winner is, by definition, of high fitness.  Large distance ensures
> that each of the pair has something the other could use (opposites
> attract), and encourages (or at least capitalizes on) diversity.
> 
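
The quoted selection scheme can be sketched in a few lines of Python. This is
a minimal illustration under assumptions: the names tournament_winner,
pick_mates and hamming are made up here, the individuals are bit strings,
and Hamming distance stands in for the unspecified "distance*".

```python
import random
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def tournament_winner(population, fitness, m, rng):
    """Ordinary m-way tournament: sample m individuals, return the fittest."""
    return max(rng.sample(population, m), key=fitness)

def pick_mates(population, fitness, distance, n, m, rng):
    """Run n tournaments, then return the two winners that are furthest apart."""
    winners = [tournament_winner(population, fitness, m, rng) for _ in range(n)]
    # Only n*(n-1)/2 distances are computed, not one per pair in the population.
    return max(combinations(winners, 2), key=lambda pair: distance(*pair))

population = ["0000", "0001", "0011", "0111", "1111", "1010", "0101"]
count_ones = lambda s: s.count("1")
mates = pick_mates(population, count_ones, hamming, n=4, m=2, rng=random.Random(0))
print(mates)
```

Because the distance is evaluated only among the n winners, the metric cost
per pairing is O(n**2) in n rather than in the population size.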
The problem with this reasoning is that I can quickly think of several
cases where this is absolutely not the case.  For example, when trying
to optimize the bi-modal function X^2+K with a finite domain of size
2K (has the effect of shifting a domain centered around 0 a
distance of K in the positive direction).  Under traditional GAs the
strings at the two optima at 0 and 2K will interfere destructively,
reducing performance dramatically.  
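
A small Python sketch of that interference, under stated assumptions: the
function is read as (X-K)^2 maximized on the domain 0..2K, which is what the
parenthetical about shifting the domain describes; individuals are plain
8-bit binary strings; and the operator is one-point crossover. None of these
choices comes from the message itself.

```python
BITS = 8
MAX_X = 2**BITS - 1          # domain is 0..255, so 2K = 255
K = MAX_X / 2

def fitness(x):
    """Assumed bimodal objective: maxima at both ends of the domain, minimum at K."""
    return (x - K) ** 2

def one_point_children(a, b, cut):
    """Keep the top `cut` bits of each parent and swap the remaining low bits."""
    low_mask = (1 << (BITS - cut)) - 1
    high_mask = MAX_X ^ low_mask
    return (a & high_mask) | (b & low_mask), (b & high_mask) | (a & low_mask)

parents = (0, MAX_X)         # the two optima: 00000000 and 11111111
for cut in range(1, BITS):
    kids = one_point_children(*parents, cut)
    print(cut, kids, [fitness(k) for k in kids])
```

Every child of the two optima lands strictly inside the domain and scores
below both parents, so mating the two most distant winners is the worst
possible pairing in this case.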

I suspect the same may be found with GP-oriented examples.  Looks like
Goldberg's area of work.  GP extensions might be useful.

Best regards,
Richard Freytag

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Mar 10 23:56:14 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA16019
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 10 Mar 1994 23:46:35 -0600
Received: from Xenon.Stanford.EDU (Xenon.Stanford.EDU [36.28.0.25]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id UAA21535 for <Genetic-Programming@list.stanford.edu>; Thu, 10 Mar 1994 20:18:47 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received:  by Xenon.Stanford.EDU (5.61+IDA/25-CS-eef) id AA00145; Thu, 10 Mar 94 20:17:16 -0800
Date: Thu, 10 Mar 94 20:17:15 PST
From: zinJANthropus JANNINK <jan@cs.stanford.edu>
To: genetic-programming@cs.stanford.edu
Subject: inbreeding vs exogamy <=> short term vs long term
Message-Id: <CMM.0.90.4.763359435.jan@Xenon.Stanford.EDU>
Status: RO

A lot of the bandwidth lately has gone into the discussion of methods to
breed better individuals.  I think we need to take a look at nature to see
more clearly that there is an unending competition between long and short
term strategies.

If we consider specialization

we see that highly specialized critters tend to have a short term advantage,
whereas less specialized ones tend to last longer evolutionarily speaking.

If we look at agriculture and horticulture

we see that humans have bred highly productive crops and attractive flowers
that have very low genetic diversity and little ability to survive on their
own.

We can also see that most pure bred dogs and cats are prone to unusual
diseases and defects, whereas this is less often true for mixed breeds.


Turning to GA/GP we find that most work has been geared exclusively towards
short term results.  I think that many of the problems we have with premature
convergence are related to an incomplete appreciation of the way evolution
works.

To me it seems clear that greedy algorithms are tuned towards the individual
not the population, and that overuse of short term strategy actually hurts
a species.  On the other hand, long term strategy necessarily implies
sluggish results.


I propose that we start some serious discussion of how to implement bi-modal
reproduction strategies in GP, which I think should lead to continued
progress in the field.  I look forward to your comments.

Jan Jannink -- no, I don't have a fancy .sig

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Mar 10 23:42:08 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA15731
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 10 Mar 1994 23:28:22 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id UAA21523 for <Genetic-Programming@list.stanford.edu>; Thu, 10 Mar 1994 20:12:14 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA15571; Thu, 10 Mar 1994 20:10:30 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA01979; Thu, 10 Mar 94 20:10:28 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA08383; Thu, 10 Mar 94 20:10:27 PST
Date: Thu, 10 Mar 94 20:10:27 PST
Message-Id: <9403110410.AA08383@hume.CS.ORST.EDU>
To: phred@leland.Stanford.EDU, levenick@willamette.edu, french@willamette.edu,
        hthies@willamette.edu, ekerr@willamette.edu, jtilton@willamette.edu,
        dudeyp@chert.CS.ORST.EDU
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <199403110359.TAA11961@elaine32.Stanford.EDU> (message from David Andre on Thu, 10 Mar 1994 19:59:43 -0800 (PST))
Subject: Re: Sexiness:  Did I have it backwards?..
Status: RO

 > From: David Andre <phred@leland.Stanford.EDU>
 > Date: Thu, 10 Mar 1994 19:59:43 -0800 (PST)
 > 
 > I think that this is a wash -- in order to really know 
 > who one program *should* mate with, you need to know 
 > 1) what they are good at, 2) if they are 'perfect' on those things
 > that they are good at, and 3) where to do crossover so as to salvage
 > the things they are good at.  GP 'rule-sets' are not like 
 > individual rules

Hmmm, maybe you're right.  Maybe GPs just aren't a good tactic for
combining subsolutions.

Meta-GP idea:

Breed a function that's pretty good at doing task A.  When this
converges, if it's perfect (or good enough), use that.  If not, breed
another function that's good at spotting the cases that mess up the
original function.  Then, breed a third function that focuses on the
pathological cases, and build this:

(if (pathological problem)
	(pathological-solution problem)
	(general-solution problem))

This could, of course, be applied recursively.  It could also be mixed
with other methods -- maybe neural nets for some parts, explicitly
designed rules for others.
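
The wrapper above can be sketched in Python as a higher-order function. This
is a minimal illustration: compose is a hypothetical name, and the three
lambdas are hand-written stand-ins for what would be separately evolved
programs.

```python
def compose(is_pathological, pathological_solution, general_solution):
    """Route each problem to the specialist when the spotter flags it."""
    def solve(problem):
        if is_pathological(problem):
            return pathological_solution(problem)
        return general_solution(problem)
    return solve

# Stand-ins for evolved functions on the toy task "absolute value".
general = lambda x: x            # correct only for x >= 0
spotter = lambda x: x < 0        # flags the cases that break `general`
specialist = lambda x: -x        # handles the flagged cases

solve = compose(spotter, specialist, general)

# Applied recursively: the composed solver is itself a "general solution"
# that another spotter/specialist pair can wrap.
solve2 = compose(lambda x: x is None, lambda x: 0, solve)

print(solve(-3), solve(4), solve2(None), solve2(-7))
```

Since the result of compose has the same shape as its general_solution
argument, any of the three slots could just as well hold a neural net or a
hand-designed rule.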

Waddya think?

 > perhaps sexiness works better in classifier (like) systems that have
 > multiple sets of 'rules' per individual...

Are you talking about Holland's classifier systems?  I need
references!  :)

Coming up with ideas faster than I can test them,
/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\       I'm in favor of gun control, but it doesn't have much to do with      /
/   crime.  The vast majority of handgun deaths are suicides and accidents.   \
\                                                                             /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar 11 01:10:58 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA27288
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Mar 1994 01:03:41 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id UAA21490 for <Genetic-Programming@list.stanford.edu>; Thu, 10 Mar 1994 20:00:57 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from elaine32.Stanford.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA15384; Thu, 10 Mar 1994 19:59:53 -0800
Received: from localhost (phred@localhost) by elaine32.Stanford.EDU (8.6.4/8.6.4) id TAA11961; Thu, 10 Mar 1994 19:59:46 -0800
From: David Andre <phred@leland.Stanford.EDU>
Message-Id: <199403110359.TAA11961@elaine32.Stanford.EDU>
Subject: Re: Sexiness:  Did I have it backwards?..
To: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Date: Thu, 10 Mar 1994 19:59:43 -0800 (PST)
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <9403110236.AA08357@hume.CS.ORST.EDU> from "Peter Dudey" at Mar 10, 94 06:36:05 pm
X-Mailer: ELM [version 2.4 PL21]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 448       
Status: RO

I think that this is a wash -- in order to really know 
who one program *should* mate with, you need to know 
1) what they are good at, 2) if they are 'perfect' on those things
that they are good at, and 3) where to do crossover so as to salvage
the things they are good at.  GP 'rule-sets' are not like 
individual rules

perhaps sexiness works better in classifier (like) systems that have
multiple sets of 'rules' per individual...

David Andre

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Mar 10 21:56:31 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA13704
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 10 Mar 1994 21:49:43 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id SAA21400 for <Genetic-Programming@list.stanford.edu>; Thu, 10 Mar 1994 18:37:24 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA13663; Thu, 10 Mar 1994 18:36:19 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA01629; Thu, 10 Mar 94 18:36:06 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA08357; Thu, 10 Mar 94 18:36:05 PST
Date: Thu, 10 Mar 94 18:36:05 PST
Message-Id: <9403110236.AA08357@hume.CS.ORST.EDU>
To: hthies@willamette.edu, ekerr@willamette.edu,
        genetic-programming@cs.stanford.edu, levenick@willamette.edu,
        french@willamette.edu
Subject: Sexiness:  Did I have it backwards?
Status: RO

I've done some initial tests of sexiness, and the results haven't been
very impressive.  It doesn't seem to significantly worsen things, but
it doesn't seem to significantly improve them, either.

I ran the idea across Paul Cull, who may be the sole GA/GP aficionado
here, and who specializes in "mathematical biology".  "The idea," I
explained, "is that you want to breed with someone who's good at the
things you're not good at."

He thought this was a terrible idea, and suggested that one wants to
breed with others that are GOOD at the same things.

I'm going to give this a shot.  What do you all think?

I'm still on the lookout for some good problems where traditional
GA/GP converges too soon, and sub-solutions might meaningfully be
combined.

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\       I'm in favor of gun control, but it doesn't have much to do with      /
/   crime.  The vast majority of handgun deaths are suicides and accidents.   \
\                                                                             /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Mar 10 06:41:17 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA14782
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 10 Mar 1994 06:37:13 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id DAA19775 for <Genetic-Programming@list.stanford.edu>; Thu, 10 Mar 1994 03:37:52 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from news.std.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA07494; Thu, 10 Mar 1994 03:36:48 -0800
Received: from world.std.com by news.std.com (5.65c/Spike-2.1)
	id AA10339; Thu, 10 Mar 1994 06:36:47 -0500
Received: by world.std.com (5.65c/Spike-2.0)
	id AA01529; Thu, 10 Mar 1994 06:36:46 -0500
Date: Thu, 10 Mar 1994 06:36:45 -0500 (EST)
From: Gilbert Syswerda <syswerda@optimax.com>
Subject: Borland C++ 4.0
To: genetic-programming@cs.stanford.edu
Message-Id: <Pine.3.89.9403100654.A837-0100000@world.std.com>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Status: RO

In the last few days, someone here mentioned doing GP using Borland C++ 
4.0 and DPMI.

Whoever you are, could you message me? I'd like to ask a couple of 
questions about the mechanics of using 4.0 and DPMI.

--Gil

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Mar 10 02:55:32 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA12323
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 10 Mar 1994 02:46:21 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id XAA19660 for <Genetic-Programming@list.stanford.edu>; Wed, 9 Mar 1994 23:46:44 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from fcitds.fcit.monash.edu.au (nashi.fcit.monash.edu.au) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA01148; Wed, 9 Mar 1994 23:45:39 -0800
Received: by fcitds.fcit.monash.edu.au (5.65/DEC-Ultrix/4.3)
	id AA06651; Thu, 10 Mar 1994 17:45:10 +1000
Date: Thu, 10 Mar 1994 17:45:10 +1000
From: simonr@apple.fcit.monash.edu.au (Simon Raik)
Message-Id: <9403100745.AA06651@fcitds.fcit.monash.edu.au>
To: bankoskj@cs.rpi.edu, genetic-programming@cs.stanford.edu
Subject: Re:  GP in prolog
Status: RO

James & GPers,

>I remember seeing a reference to a GP system done in Prolog.  Where
>might I go to find it?  I am very interested in using it for a
>potential application...

I wrote a simple GP system as an assignment last year. I would be
happy to give you, and anyone else who is interested, the code. It
is a GP engine and auxiliary functions for the Santa Fe trail problem
written in LP Prolog 3.10.

Please mail me if you want to see the code or have any other queries.

Simon.

////////////////////////////////////////////////////////////////////////
                            Simon Raik
                  simonr@nellads.cc.monash.edu.au 
                     Monash University Australia.
////////////////////////////////////////////////////////////////////////

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Wed Mar  9 15:55:46 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA06347
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Wed, 9 Mar 1994 15:40:26 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA18457 for <Genetic-Programming@list.stanford.edu>; Wed, 9 Mar 1994 09:20:14 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from cs.rpi.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA26408; Wed, 9 Mar 1994 09:19:10 -0800
Received: from williams.cs.rpi.edu by cs.rpi.edu (5.67a/1.4-RPI-CS-Dept)
	id AA27793; Wed, 9 Mar 1994 12:19:08 -0500 (bankoskj from williams.cs.rpi.edu)
Date: Wed, 9 Mar 94 12:19:06 EST
From: bankoskj@cs.rpi.edu
Received: by williams.cs.rpi.edu (4.1/2.2-RPI-CS-client)
	id AA15016; Wed, 9 Mar 94 12:19:06 EST
Message-Id: <9403091719.AA15016@williams.cs.rpi.edu>
To: genetic-programming@cs.stanford.edu
Subject: GP in prolog
Status: RO


I remember seeing a reference to a GP system done in Prolog.  Where
might I go to find it?  I am very interested in using it for a
potential application...


Thanks,

James Bankoski
CS Department R.P.I. 

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Mar  7 22:04:04 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA23546
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 7 Mar 1994 21:56:03 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id TAA16082 for <Genetic-Programming@list.stanford.edu>; Mon, 7 Mar 1994 19:11:41 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from Aphid.Stanford.EDU (CS.Stanford.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA18897; Mon, 7 Mar 1994 19:10:37 -0800
Received: from dorite.use.com by Aphid.Stanford.EDU with SMTP (5.61+IDA/25-eef) id AA04048; Mon, 7 Mar 94 19:10:36 -0800
Received: from [192.207.21.206] by dorite.use.com with smtp
	(Smail3.1.28.1 #13) id m0pdsBC-0001FfC; Mon, 7 Mar 94 22:09 EST
Message-Id: <m0pdsBC-0001FfC@dorite.use.com>
Date: Mon, 7 Mar 94 22:09 EST
X-Sender: bpillow@dorite.use.com
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
To: genetic-programming@cs.stanford.edu
From: bpillow@pillowsoft.com (Brad Pillow)
Subject: More on Memory management for GP systems
Status: RO

For what it's worth, I use a Macintosh for all my GP work (there don't seem
to be too many Mac GPers).  I use my own framework, which is a smattering
of GPQuick (thanks Andy), my own stuff, and whatever other great ideas I
can borrow from all the other free systems that are out there (thanks
everyone).

It is implemented in Symantec C++ and uses the linearized expression tree
ala GPQuick/Mike Keith/etc.  Since I am using 16 bit nodes, and typically
expressions with max 256 nodes, I can get pops of 20000 with no prob on my
Mac (26 meg installed).  For better or worse, I don't have to worry about
page swapping since the Mac doesn't do that yet.  All in all, it makes for
a pretty fast system.  I really don't manage memory since I statically
allocate.  Next week is the intro of the PowerPC (which I will have as soon
as they ship), which will bring a 4-8x floating point improvement.  I have
found floating pt performance to be significant.  In the GP classes I have,
I can create integer or fp individuals.  On something like the Santa Fe
trail, I get a huge improvement using integer vs. fp (maybe a factor of
4).
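
The memory figures above can be checked with a little arithmetic:
20000 individuals x 256 nodes x 2 bytes per node comes to 10,240,000
bytes, a bit under 10 MB, which fits comfortably in 26 meg.  Below is
a minimal sketch of such a statically allocated population in C.  It
is not Brad's code; the names (gp_node, population, used_nodes,
copy_individual) are hypothetical and only the sizes come from the
message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes quoted in the message; every identifier here is hypothetical. */
enum { POP_SIZE = 20000, MAX_NODES = 256 };

typedef uint16_t gp_node;            /* one 16-bit node per tree position */

/* The whole population is allocated statically: no malloc/free at run time. */
gp_node  population[POP_SIZE][MAX_NODES];
uint16_t used_nodes[POP_SIZE];       /* nodes actually in use per individual */

/* Bytes taken by the node storage alone. */
size_t population_bytes(void)
{
    return sizeof population;
}

/* Overwrite individual `dst` with a copy of individual `src`. */
void copy_individual(size_t dst, size_t src)
{
    assert(dst < POP_SIZE && src < POP_SIZE);
    for (size_t i = 0; i < used_nodes[src]; i++)
        population[dst][i] = population[src][i];
    used_nodes[dst] = used_nodes[src];
}
```

Because every individual occupies a fixed-size row, replacing one is a
plain array copy and nothing ever has to be freed.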

--brad

__________________________________________________________________________
Brad Pillow                  | Internet: bpillow@pillowsoft.com
PillowSoft Inc.              | AOL: pillowsoft
11715 Fox Rd., Suite 400-120 | Compuserve: 71470,2751
Indianapolis, IN 46236       | Phone: 317-823-8756, Fax: 317-823-5988

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Mar  7 18:04:09 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA16342
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 7 Mar 1994 17:51:53 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id OAA15528 for <Genetic-Programming@list.stanford.edu>; Mon, 7 Mar 1994 14:46:15 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from mail.netcom.com (netcom7.netcom.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA08226; Mon, 7 Mar 1994 14:45:12 -0800
Received: from localhost by mail.netcom.com (8.6.4/SMI-4.1/Netcom)
	id OAA19697; Mon, 7 Mar 1994 14:46:03 -0800
Date: Mon, 7 Mar 1994 14:46:03 -0800
From: order@netcom.com (Walter Alden Tackett)
Message-Id: <199403072246.OAA19697@mail.netcom.com>
To: Rice@HPP.Stanford.EDU
Cc: smaxwell@wpo.borland.com, dudeyp@chert.CS.ORST.EDU,
        genetic-programming@cs.stanford.edu
In-Reply-To: <2972065232-9533916@KSL-EXP-35> (message from James Rice on 07 Mar 1994 13:40:32 -0800 (PST))
Subject: Re: Memory management for GP systems [Oops, long]
Status: RO

> EGC will clean up behind you.  If you're using some sort
> of implementation that doesn't have a GC then you'd have
> to keep a free list of tree node cells.  You might end up
> finding that this is enough extra bother that it's better
> to swap to a vectorised representation, anyway.

Actually, recursively freeing a tree in C or Pascal is easy.  The
total algorithm in SGPC, which is about as S as you can get ;-) ,
requires about 27 lines of C code including headers and declarations.
I think the truly clever programmer (e.g., see the article by Keith &
Martin in Kinnear's forthcoming collection) can do it in a lot less.
Needless to say, not using recursion makes the algorithm and the PBSL
(Programmer Brain Stress Level) much larger, in probability.
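
For readers who have not written one, a recursive tree free might look
roughly like the sketch below.  This is not the SGPC routine; the node
layout, MAX_ARITY, and the function names are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_ARITY 4                  /* assumed upper bound on function arity */

typedef struct node {
    int opcode;                      /* function or terminal id */
    int arity;                       /* number of children in use */
    struct node *child[MAX_ARITY];
} node;

/* Allocate a node with no children attached; returns NULL if calloc fails. */
node *new_node(int opcode, int arity)
{
    assert(arity >= 0 && arity <= MAX_ARITY);
    node *n = calloc(1, sizeof *n);
    if (n != NULL) {
        n->opcode = opcode;
        n->arity = arity;
    }
    return n;
}

/* Free a whole tree, children first; returns the number of nodes released. */
int free_tree(node *n)
{
    if (n == NULL)
        return 0;
    int freed = 1;
    for (int i = 0; i < n->arity; i++)
        freed += free_tree(n->child[i]);
    free(n);
    return freed;
}
```

The recursion depth equals the tree depth, so a depth-limited GP tree
keeps the C stack usage small.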

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Mar  7 18:04:02 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA13816
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 7 Mar 1994 16:31:23 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id NAA15356 for <Genetic-Programming@list.stanford.edu>; Mon, 7 Mar 1994 13:43:14 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from HPP.Stanford.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA04581; Mon, 7 Mar 1994 13:42:10 -0800
Received: from KSL-EXP-35 (KSL-EXP-35.Stanford.EDU) by HPP.Stanford.EDU (4.1/inc-1.0)
	id AA03424; Mon, 7 Mar 94 13:40:41 PST
Message-Id: <2972065232-9533916@KSL-EXP-35>
Sender: RICE@KSL-EXP-35.Stanford.EDU
Date: Mon, 7 Mar 94  13:40:32 PST
From: James Rice <Rice@HPP.Stanford.EDU>
To: smaxwell@wpo.borland.com, dudeyp@chert.CS.ORST.EDU
Cc: genetic-programming@cs.stanford.edu
Subject: Re: Memory management for GP systems [Oops, long]
In-Reply-To: <sd7b0c02.060@wpo.borland.com>
Status: RO

It's a simple misconception to assume that a generational
GP implementation needs to use twice as much memory as a
S/S implementation because of creating the new population
before you've thrown away the old one.  This is actually
not true, but is an easy mistake to make because that's
the easiest way to do it (and it's the way that it's done
in the Lisp code from The Book, which has spawned so many
workalikes).  What you need to do is as follows (or
obviously something isomorphic):

Before you do any crossover etc., run the roulette
wheel (or whatever) the relevant number of times in order
to decide up-front on all of the genetic operations that
you are going to perform and the parents that are going to
be used for these operations.  Every time you select an
individual as a parent increment a reference count (all
reference counts are zeroised before you start any of
this).  After you've selected all of the genetic
operations you end up with a population that has high
reference counts for the highly fit individuals and zero
for the individuals which were (generally) unfit and hence
were not selected at all.  Any individual that has a
reference count of zero can immediately be overwritten by
offspring from the genetic operations and so such
individuals constitute a free memory pool to be used for
the next generation.  All you then need to do is loop
through the list of genetic operations performing them.
You select memory for the offspring individual(s) from the
free pool provided by the individuals with zero reference
counts.  As you perform the genetic operation you
decrement the reference counts of the parents, and when
they hit zero you add them to the offspring free list
(actually, a stack is better).

[This is really just a hand implementation of a
reference-counting GC.  Bring back InterLisp, all is
forgiven.]

If you do this you will find that the consumption of the
parents always keeps pace with the available memory for
the offspring.  The only way that this can fail (that I
know of) is if every member of the population gets
selected exactly once.  In this case you'd need two free
slots in your free list to start with [actually a number
equal to the max(number of offspring from genetic
operations)].  However, this will never actually happen
until some time in the future when all of the monkeys have
successfully typed up the complete works of Shakespeare
(including sonnets), so it's not exactly a problem to
worry about, even though the solution is trivial.
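
The scheme described above can be sketched in C as follows.  This is
only an illustration under stated assumptions: individuals are
identified by slot number, every operation yields as many offspring as
it has parents (so crossover gives two children), and SPARE extra
slots are kept, following the bracketed remark about max(number of
offspring).  All names (gp_op, breed_in_place, and so on) are
hypothetical.

```c
#include <assert.h>
#include <string.h>

enum { POP = 6, SPARE = 2, SLOTS = POP + SPARE };  /* tiny, for illustration */

typedef struct {
    int n_parents;                   /* 1 = copy or mutation, 2 = crossover */
    int parent[2];                   /* slots of the chosen parents */
} gp_op;

static int refcount[SLOTS];
static int free_stack[SLOTS];
static int free_top;                 /* number of slots on the free stack */

static void push_free(int slot)
{
    assert(free_top < SLOTS);
    free_stack[free_top++] = slot;
}

static int pop_free(void)
{
    assert(free_top > 0);
    return free_stack[--free_top];
}

/*
 * current[]: the POP slots holding this generation.
 * ops[]:     operations decided up front; each yields n_parents offspring.
 * next[]:    receives the POP slots that hold the new generation.
 * Returns the smallest free-stack depth seen just before any operation.
 */
int breed_in_place(const int current[POP], const gp_op *ops, int n_ops,
                   int next[POP])
{
    int is_current[SLOTS] = {0};
    int made = 0, min_free = SLOTS;

    /* Pass 1: count how often each individual was selected as a parent. */
    memset(refcount, 0, sizeof refcount);
    for (int i = 0; i < n_ops; i++)
        for (int p = 0; p < ops[i].n_parents; p++)
            refcount[ops[i].parent[p]]++;

    /* Unselected individuals and unused slots form the initial free pool. */
    free_top = 0;
    for (int i = 0; i < POP; i++)
        is_current[current[i]] = 1;
    for (int s = 0; s < SLOTS; s++)
        if (!is_current[s] || refcount[s] == 0)
            push_free(s);

    /* Pass 2: perform the operations, recycling parents as they run out. */
    for (int i = 0; i < n_ops; i++) {
        if (free_top < min_free)
            min_free = free_top;
        for (int c = 0; c < ops[i].n_parents; c++) {
            assert(made < POP);
            next[made++] = pop_free();   /* offspring would be written here */
        }
        for (int p = 0; p < ops[i].n_parents; p++)
            if (--refcount[ops[i].parent[p]] == 0)
                push_free(ops[i].parent[p]);
    }
    assert(made == POP);
    return min_free;
}
```

In this sketch the offspring slots are claimed before the parents'
counts are decremented, so the spare slots are what keep the stack
from running dry when the parents of an operation are on their last
reference.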

Of course, there's still the problem of how you reclaim
the memory for the individuals that are freed up.  This is
trivial if you are using a vectorised representation
(which is the most memory efficient representation
anyway).  If you're using a tree based representation then
(if you're using Lisp) you can traverse the trees and snip
out the CDRs of all of the cons cells that make up the
trees (starting at the leaves, of course), after this the
EGC will clean up behind you.  If you're using some sort
of implementation that doesn't have a GC then you'd have
to keep a free list of tree node cells.  You might end up
finding that this is enough extra bother that it's better
to swap to a vectorised representation, anyway.
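
As a rough idea of the bookkeeping involved, here is a minimal sketch
of a free list of tree node cells in C.  It assumes binary nodes and a
fixed-size pool; the names (cell, pool_init, cell_alloc, tree_release)
and POOL_SIZE are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 1024               /* assumed capacity, in nodes */

typedef struct cell {
    int opcode;
    struct cell *left;               /* doubles as the free-list link */
    struct cell *right;
} cell;

static cell   pool[POOL_SIZE];
static cell  *free_list;
static size_t free_count;

/* Thread every cell of the pool onto the free list. */
void pool_init(void)
{
    free_list = NULL;
    free_count = 0;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        pool[i].left = free_list;
        free_list = &pool[i];
        free_count++;
    }
}

/* Take one cell off the free list; returns NULL when the pool is empty. */
cell *cell_alloc(int opcode, cell *left, cell *right)
{
    cell *c = free_list;
    if (c == NULL)
        return NULL;
    free_list = c->left;
    free_count--;
    c->opcode = opcode;
    c->left = left;
    c->right = right;
    return c;
}

/* Return a whole tree to the free list, children first. */
void tree_release(cell *c)
{
    if (c == NULL)
        return;
    tree_release(c->left);
    tree_release(c->right);
    assert(free_count < POOL_SIZE);
    c->left = free_list;
    free_list = c;
    free_count++;
}

size_t cells_free(void)
{
    return free_count;
}
```

Allocation and release are both constant time per cell, but every tree
still has to be walked when it dies, which is the extra bother a
vectorised representation avoids.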


Rice - if you're being clever, you'll end up sorting the
vector that contains the population (i.e., sorting the
pointers to the data structures that represent the
individuals) according to the order in which they were
originally allocated (you might just have a separate array
that points to them that doesn't get reordered by all of
this allocating and deallocating).  This will result in
you looping over the population for fitness computation
in a manner that uses memory contiguously and hence
reduces paging overhead.  If you're really disc bound you
might want to sort the individuals by fitness before you
do your genetic operations, but this involves moving the
contents of the individual data structures, not the
pointers to the data structures so that you still end up
with the contiguous memory allocation mentioned
above, but the _contents_ of that memory is now sorted by
fitness.  This will pay because higher fitness individuals
are preferentially selected.
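
Sorting the pointer vector back into allocation order can be sketched
in C as below, under the assumption that allocation order matches
address order (true when the individuals live in one array).  The
names are hypothetical, and comparing addresses through uintptr_t is
the usual practical approach rather than something the C standard
guarantees for unrelated objects.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct individual {
    double fitness;                  /* other fields omitted */
} individual;

/* qsort comparator: order pointers by the address they hold. */
static int by_address(const void *a, const void *b)
{
    uintptr_t pa = (uintptr_t)*(individual *const *)a;
    uintptr_t pb = (uintptr_t)*(individual *const *)b;
    return (pa > pb) - (pa < pb);
}

/* Reorder the pointer vector so a linear walk touches memory in order. */
void sort_by_address(individual **pop, size_t n)
{
    assert(pop != NULL || n == 0);
    qsort(pop, n, sizeof *pop, by_address);
}
```

Only the pointers move here; the individuals themselves stay where
they were allocated.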

*** All Un/Subscribe messages should go to      ***
*** genetic-programming-REQUEST@cs.stanford.edu ***
***                    ^^^^^^^^                 ***

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Mar  7 14:29:01 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA09246
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 7 Mar 1994 14:14:21 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA15075 for <Genetic-Programming@list.stanford.edu>; Mon, 7 Mar 1994 11:16:32 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from wpo.borland.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA27646; Mon, 7 Mar 1994 11:15:26 -0800
Received: from Borland-Message_Server by wpo.borland.com
	with WordPerfect_Office; Mon, 07 Mar 1994 11:12:02 -0800
Message-Id: <sd7b0c02.060@wpo.borland.com>
X-Mailer: WordPerfect Office 4.0
Date: Mon, 07 Mar 1994 10:43:46 -0800
From: smaxwell@wpo.borland.com
To: dudeyp@chert.CS.ORST.EDU
Cc: genetic-programming@cs.stanford.edu
Subject:  Memory management for GP systems -Reply
Status: RO

> I've just hacked up a GP system for my sexiness tests (which are
> being run even as I write).  Doing this, I am reminded that GP takes up
> outrageous amounts of memory.

> What is the standard response to this -- use virtual memory, or get
> ahold of a "real" machine?

I've been doing all my GP work on a PC (yeah, well.  I work at Borland,
fer crying out loud :-).  With 16-bit code, I can deal with populations of
500.  In order to work with larger populations, I've gone to 32-bit code
running with 32-bit DPMI (Borland C++ V4.0), and can experiment with
populations on the order of 20K.

Of course, since I'm also using my coroutine model, which is similar in
memory behavior to Steady State, I use perhaps half the memory of a
generational model (don't have to keep old and new populations
around)....

-+- Sid

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Mar  6 14:19:13 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA21567
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 6 Mar 1994 14:11:49 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA13565 for <Genetic-Programming@list.stanford.edu>; Sun, 6 Mar 1994 11:32:52 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA27526; Sun, 6 Mar 1994 11:31:45 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA14564; Sun, 6 Mar 94 11:31:42 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA06005; Sun, 6 Mar 94 11:31:44 PST
Date: Sun, 6 Mar 94 11:31:44 PST
Message-Id: <9403061931.AA06005@hume.CS.ORST.EDU>
To: genetic-programming@cs.stanford.edu
Subject: I need some pathological cases to test sexiness
Status: RO

I've got my code up and running, and I need some pathological cases
for testing / showing off.  These should be problems where:

-Vanilla GP tends to get stuck on a local extreme, and
-Sub-solutions might meaningfully be combined by crossover.

The problem I'm testing on now is recognizing pairs of on bits in a
string of five bits.  The initial tests didn't show much difference
between the vanilla and the sexy (what a great name for a
miniseries...), but I'm going to try cranking down the maximum critter
size.

Thanks,
/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\  "It is possible, by extremely perverse manipulation of the package system, /
/ to cause the sequence of letters NIL to be recognized not as the symbol that\
\ represents the empty list but as another symbol with the same name." -CLtL2 /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Mar  6 14:04:28 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA21263
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 6 Mar 1994 13:55:59 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA13558 for <Genetic-Programming@list.stanford.edu>; Sun, 6 Mar 1994 11:16:10 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA27332; Sun, 6 Mar 1994 11:15:05 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA14522; Sun, 6 Mar 94 11:15:02 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA05996; Sun, 6 Mar 94 11:15:03 PST
Date: Sun, 6 Mar 94 11:15:03 PST
Message-Id: <9403061915.AA05996@hume.CS.ORST.EDU>
To: tesler@taurus.apple.com
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <199403061703.JAA09417@taurus.apple.com> (tesler@taurus.apple.com)
Subject: Re: Memory management for GP systems
Status: RO

 > Date: Sun, 6 Mar 1994 09:04:26 -0700
 > From: tesler@taurus.apple.com
 > 
 > At  8:13 PM 3/5/94 -0800, Peter Dudey wrote:
 > >I've just hacked up a GP system for my sexiness tests (which are being
 > >run even as I write).  Doing this, I am reminded that GP takes up
 > >outrageous amounts of memory.
 > 
 > How many individuals exist at a time and what is their average size, in
 > some unit such as tree nodes?  How much memory does your GP system use (per
 > node)?

Both the size of the population and the size of the critters are user
options.  The amount of memory will probably depend on the Lisp under
which the code is run.

 > >What is the standard response to this -- use virtual memory, or get
 > >ahold of a "real" machine?
 > 
 > What hardware have you got?

A SPARCstation1, but I don't have exclusive use of it.

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\  "It is possible, by extremely perverse manipulation of the package system, /
/ to cause the sequence of letters NIL to be recognized not as the symbol that\
\ represents the empty list but as another symbol with the same name." -CLtL2 /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Mar  6 14:05:53 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA21246
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 6 Mar 1994 13:54:52 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA13552 for <Genetic-Programming@list.stanford.edu>; Sun, 6 Mar 1994 11:12:06 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA27277; Sun, 6 Mar 1994 11:11:00 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA14509; Sun, 6 Mar 94 11:10:59 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA05993; Sun, 6 Mar 94 11:10:58 PST
Date: Sun, 6 Mar 94 11:10:58 PST
Message-Id: <9403061910.AA05993@hume.CS.ORST.EDU>
To: genetic-programming@cs.stanford.edu
Subject: Small, public domain, CLOS-based GP code -- here it is!
Status: RO

;;;; Genetic Programming
;;;
;;; Based on John Koza's _Genetic_Programming_, 1992, MIT Press.
;;;
;;; Implemented in Common LISP, using CLOS
;;;
;;; by Peter Dudey
;;; dudeyp@research.cs.orst.edu
;;;
;;; Started:  1 March 1994
;;; Finished:  5 March 1994
;;;
;;; To use, evaluate
;;; (evolve (make-instance 'population))
;;;
;;; You can set many of the slots listed in the definition of population, e.g.:
;;; (evolve (make-instance 'population :size 500 :max-generations 30))
;;;
;;; If you want to do something truly funky, you should define a subclass
;;; of population, as in the accompanying files.
;;;
;;; WARNING:  Fitness values are kept in copied individuals.  This will need
;;; to be overridden if the fitness function is nondeterministic!
;;;


;;; COMPILER DIRECTIVES
;;;
;;; You might want to override this while debugging your code...
;;;

(eval-when (compile)
	   (proclaim '(optimize speed space (safety 0))))


;;; MACROS AND NON-GENERIC FUNCTIONS
;;;
;;; These are mostly to make later code more legible.
;;;

(defmacro arity (nonterminal)
  "Returns the number of arguments a nonterminal takes."
  `(second ,nonterminal))

(defmacro flat-list-p (list)
  "Returns T iff LIST is a list of atoms."
  `(not (find-if #'consp ,list)))

(defun depth (form)
  "Returns the depth of a form, e.g., 0 for atoms, 2 for alists."
  (cond
   ((atom form)
    0)
   (T
    (1+ (apply #'max (mapcar #'depth form))))))


;;; CLASS DEFINITIONS

(defclass critter ()
  ((code					; LISP code for the critter
    :initarg :code
    :accessor code
    :type list)
   (branch-count-tree				; Branch-count-tree for code
    :initarg :branch-count-tree			;  (+ (* 1 (+ 0 1)) 1) is
    :accessor branch-count-tree			;  (7 (5 1 (3 1 1)) 1)
    :type list					;  (i.e., car = 1 + sum cdr)
    :initform ())
   (raw-fitness					; Raw fitness, e.g., error
    :accessor raw-fitness
    :initform NIL)
   (normalized-fitness				; (worst-raw - my-raw)
    :accessor normalized-fitness
    :type number))
  (:documentation
   "An individual in the population."))

(defclass population ()
  ((size					; Number of critters
    :initarg :size
    :accessor size
    :type integer
    :initform 200)
   (terminals					; Constants and parameters
    :initarg :terminals
    :accessor terminals
    :type list
    :initform '(0 1))
   (nonterminals				; Functions
    :initarg :nonterminals
    :accessor nonterminals
    :type list
    :initform '((* 2) (+ 2) (- 2)))		; Form is (function arity)
   (max-depth					; Maximum critter tree depth
    :initarg :max-depth				;  New critters are 1/2 this
    :accessor max-depth				;  Mutant branches 1/3
    :type integer
    :initform 10)
   (critter-type				; Class of individuals
    :initarg :critter-type
    :accessor critter-type
    :type symbol
    :initform 'critter)
   (critters					; Members of the population
    :initarg :critters
    :accessor critters
    :type list
    :initform ())
   (fitness-test				; Maps a critter to raw fitness
    :initarg :fitness-test
    :accessor fitness-test
    :type function
    :initform #'(lambda (x)			; Here, raw fitness is how far
		  (abs (- 17 (eval (code x)))))) ; a critter is from 17
   (best-raw-fitness				; Best raw fitness
    :accessor best-raw-fitness
    :type number)
   (average-raw-fitness				; Average raw fitness
    :accessor average-raw-fitness
    :type number)
   (worst-raw-fitness				; Worst raw fitness
    :accessor worst-raw-fitness
    :type number)
   (normalized-fitness-sum			; Sum of normalized fitness
    :accessor normalized-fitness-sum		;  for roulette wheel
    :type number)
   (generation					; # of current generation
    :accessor generation
    :type integer
    :initform 0)
   (max-generations
    :initarg :max-generations
    :accessor max-generations
    :type integer
    :initform 10)
   (crossover-permil				; Critters per 1000 from
    :initarg :crossover-permil			;  crossover
    :accessor crossover-permil
    :type integer
    :initform 150)
   (mutation-permil				; Critters per 1000 from
    :initarg :mutation-permil			;  mutation
    :accessor mutation-permil
    :type integer
    :initform 1))
  (:documentation
   "A population of individuals."))


;;; GENERIC FUNCTIONS

(defgeneric make-code (terminals nonterminals max-depth)
  (:documentation
   "Returns an s-expression with a depth of MAX-DEPTH whose leaves are
TERMINALS and whose internal nodes are NONTERMINALS."))

(defgeneric initialize-gene-pool (population)
  (:documentation
   "Fills POPULATION with the appropriate type of critter and evaluates
them."))

(defgeneric display (object &optional indent stream verbose)
  (:documentation
   "Displays OBJECT in a useful manner."))

(defgeneric done (population)
  (:documentation
   "Returns T iff POPULATION's termination criterion is met."))

(defgeneric evaluate (object &optional fitness-test)
  (:documentation
   "Performs fitness evaluation on OBJECT."))

(defgeneric breed (population)
  (:documentation
   "Produces the next generation through crossover and mutation."))

(defgeneric choose (population &optional chooser)
  (:documentation
   "Selects a critter from POPULATION probabilistically, based on fitness.  The
CHOOSER argument will be used for sexiness."))

(defgeneric copy (object)
  (:documentation
   "Produces a copy of OBJECT, which can safely be destructively modified."))

(defgeneric count-branches (object)
  (:documentation
   "Sets OBJECT's branch-count-tree if it is not already set.  Saving this
information makes crossover faster."))

(defgeneric total-branches (object)
  (:documentation
   "A faster equivalent of (car (count-branches object))."))

(defgeneric sub-tree (object
		      &optional
		      branch-count-tree
		      branch-number
		      accessor-so-far)
  (:documentation
   "Returns a form which evaluates to a random sub-tree of OBJECT (or OBJECT's
code, if it's a critter).  This is a valid setf form."))

(defgeneric cross (mom dad max-depth)
  (:documentation
   "Produces a critter which is a cross between MOM and DAD.  These names are
not meant to imply any gender division;  there is none."))

(defgeneric crossover (population)
  (:documentation
   "Crosses two individuals in POPULATION, and returns the result."))

(defgeneric mutate (critter terminals nonterminals max-depth)
  (:documentation
   "Produces a critter which is a mutation of CRITTER."))

(defgeneric evolve (population)
  (:documentation
   "Runs POPULATION until (done population) returns T.  See p. 76 of Koza."))


;;; METHODS

(defmethod initialize-instance :after ((self population) &key)
  (format T "Generating initial gene pool...")
  (initialize-gene-pool self)
  (format T "Done~%"))

(defmethod initialize-gene-pool ((population population))
  (with-slots
   (size terminals nonterminals max-depth critters)
   population
   ;; Create a bunch of critters
   (setf critters NIL)
   (dotimes (i size)
	    (push (make-instance (critter-type population)
				 :code
				 (make-code terminals
					    nonterminals
					    (floor max-depth 2)))
		  critters))
   (evaluate population)))			; Evaluate the initial critters

(defmethod make-code ((terminals list)
		      (nonterminals list)
		      (max-depth integer))
  (cond
   ;; If max-depth is 0, return a random terminal
   ((= 0 max-depth)
    (nth (random (length terminals)) terminals))
   ;; Otherwise, pick a nonterminal and recursively build its children
   (T
    (let ((nonterminal (nth (random (length nonterminals)) nonterminals))
	  (children NIL))
      ;; Make the children recursively
      (dotimes (i (arity nonterminal))
	       (push (make-code terminals nonterminals (1- max-depth))
		     children))
      ;; Cons the function name onto the front, and return it
      (cons (car nonterminal) children)))))

(defmethod display ((object T)
		    &optional
		    (indent 0)
		    (stream T)
		    (verbose T))
  (declare (ignore verbose indent))
  (format stream "~a~%" object))

(defmethod display ((object population)
		    &optional
		    (indent 0)
		    (stream T)
		    (verbose T))
  (format stream "~a  Generation: ~a of ~a~%"
	  object (generation object) (max-generations object))
  (format stream "Best raw fitness: ~a~%" (best-raw-fitness object))
  (format stream "Average raw fitness: ~a~%" (average-raw-fitness object))
  (when verbose
	(format stream "Size: ~a~%" (size object))
	(format stream "Terminals: ~a~%" (terminals object))
	(format stream "Nonterminals: ~a~%" (nonterminals object))
	(format stream "Best critter:~%")
	(display (first (critters object)) (+ 3 indent) stream T)))

(defmethod display ((object critter)
		    &optional
		    (indent 0)
		    (stream T)
		    (verbose T))
  (format stream "~vt~a  Raw fitness: ~a  Normalized fitness: ~a~%"
	  indent object (raw-fitness object) (normalized-fitness object))
  (when verbose
	(display (code object) indent stream T)
	(format stream "~%")))

(defmethod done ((population population))
  (= (generation population)
     (max-generations population)))

(defmethod evaluate ((object critter) &optional (fitness-test NIL))
  (unless (raw-fitness object)
	  (setf (raw-fitness object)
		(funcall fitness-test object))))

(defmethod evaluate ((object population)
		     &optional
		     (fitness-test (fitness-test object)))
  (with-slots
   (critters best-raw-fitness worst-raw-fitness
	     average-raw-fitness normalized-fitness-sum)
   object
   (dolist (critter critters)
	   (evaluate critter fitness-test))
   ;; Sort the critters, with good (low) raw fitness first
   (setf critters
	 (sort critters #'< :key #'raw-fitness))
   ;; Compute statistics
   (setf best-raw-fitness
	 (raw-fitness (first critters)))
   (setf worst-raw-fitness
	 (raw-fitness (car (last critters))))
   ;; REDUCE sidesteps CALL-ARGUMENTS-LIMIT, which APPLY can hit on
   ;; large populations
   (setf average-raw-fitness
	 (float (/ (reduce #'+ critters :key #'raw-fitness)
		   (length critters))))
   ;; Compute normalized fitness
   (setf normalized-fitness-sum 0)
   (dolist (critter critters)
	   (incf normalized-fitness-sum
		 (setf (normalized-fitness critter)
		       (- worst-raw-fitness (raw-fitness critter)))))))

(defmethod breed ((population population))
  (with-slots
   (critters crossover-permil mutation-permil terminals nonterminals max-depth)
   population
   (let ((next-generation NIL))
     (do ((i 0 (1+ i))
	  (random-permil			; Permil = per 1000
	   (random 1000)
	   (random 1000)))
	 ((= i (size population))		; When there are enough,
	  (setf critters next-generation))	;  make these the critters
	 (cond
	  ;; Copy an individual
	  ((>= random-permil (+ crossover-permil mutation-permil))
	   (push (choose population)
		 next-generation))
	  ;; Perform crossover
	  ((>= random-permil mutation-permil)
	   (push (crossover population)
		 next-generation))
	  ;; Perform mutation
	  (T
	   (push (mutate (choose population)
			 terminals
			 nonterminals
			 max-depth)
		 next-generation)))))))

(defmethod choose ((population population) &optional (chooser NIL))
  (declare (ignore chooser))
  (cond
   ;; If all of the critters are equally fit, choose one at random
   ((= 0 (normalized-fitness-sum population))
    (nth (random (length (critters population)))
	 (critters population)))
   ;; Otherwise, spin the roulette wheel...
   (T
    (do* ((critter-list (critters population) (rest critter-list))
	  (n
	   (- (random (normalized-fitness-sum population))
	      (normalized-fitness (car critter-list)))
	   (- n (normalized-fitness (car critter-list)))))
	 ((< n 0) (car critter-list))))))

(defmethod copy ((object critter))
  (make-instance (class-of object)		; It might be a descendant
		 :code (copy-tree (code object))
		 :branch-count-tree (copy-tree (branch-count-tree object))))

(defmethod count-branches ((object T))
  (cond
   ((atom object)
    1)
   ;; If the car is a number, this is a branch-count-tree.  Return it
   ((numberp (car object))
    object)
   (T
    (let ((sub-trees (mapcar #'count-branches (rest object))))
      (cons (+ 1 (apply #'+ (mapcar #'total-branches sub-trees)))
	    sub-trees)))))

(defmethod count-branches ((object critter))
  (cond
   ;; If it's already been computed, return it
   ((branch-count-tree object)
    (branch-count-tree object))
   ;; Otherwise, compute, set, and return it
   (T
    (setf (branch-count-tree object)
	  (count-branches (code object))))))

(defmethod total-branches ((object T))
  (cond
   ((atom object)
    1)
   ((numberp (car object))
    (car object))
   ;; Raw code:  count it, then take the total from the car
   (T
    (car (count-branches object)))))

(defmethod total-branches ((object critter))
  (total-branches (branch-count-tree object)))

(defmethod sub-tree ((object T)
		     &optional
		     (branch-count-tree NIL)
		     (branch-number 0)
		     (accessor-so-far NIL))
  (cond
   ;; If BRANCH-NUMBER is 0, return  ACCESSOR-SO-FAR
   ((= 0 branch-number)
    accessor-so-far)
   ;; Otherwise, recurse on the appropriate sub-tree
   (T
    (do ((n 1 (1+ n))
	 (number
	  (1- branch-number)
	  (- number (total-branches (car sub-counts))))
	 (sub-trees (rest object) (rest sub-trees))
	 (sub-counts (rest branch-count-tree) (rest sub-counts)))
	((< number (total-branches (car sub-counts)))
	 (sub-tree (car sub-trees)
		   (car sub-counts)
		   number
		   `(nth ,n ,accessor-so-far)))))))

(defmethod sub-tree ((object critter)
		     &optional
		     (branch-count-tree (count-branches object))
		     (branch-number
		      (random (total-branches (count-branches object))))
		     (accessor-so-far `(code ,object)))
  (sub-tree
   (code object)
   branch-count-tree
   branch-number
   accessor-so-far))

(defmethod cross ((mom critter) (dad critter) (max-depth integer))
  (do ((kid NIL))
      ((and kid (<= (depth (code kid)) max-depth)) ; When it's small enough,
       kid)					;  return the kid
      (setq kid (copy mom))
      (eval `(setf ,(sub-tree kid)
		   ,(sub-tree dad)))
      (setf (raw-fitness kid) NIL)		; Values from mom are
      (setf (branch-count-tree kid) NIL)))	;  no longer valid

(defmethod crossover ((population population))
  (cross (choose population)
	 (choose population)
	 (max-depth population)))

(defmethod mutate ((critter critter)
		   (terminals list)
		   (nonterminals list)
		   (max-depth integer))
  (do ((mutant NIL))
      ((and mutant (<= (depth (code mutant)) max-depth)) ; When not too big,
       mutant)					;  return the mutant
      (setq mutant (copy critter))
      (eval `(setf ,(sub-tree mutant)
		   ,(make-code terminals
			       nonterminals
			       (floor max-depth 3))))
      (setf (raw-fitness mutant) NIL)		; Values from parent are
      (setf (branch-count-tree mutant) NIL)))	;  no longer valid

(defmethod evolve ((population population))
  (do ()
      ((done population))
      (display population 0 T NIL)
      (breed population)
      (evaluate population)
      (incf (generation population)))
  (display population 0 T T)			; Display POPULATION verbosely
  population)					;  and return it

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Mar  6 00:15:28 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA26587
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 6 Mar 1994 00:06:21 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id VAA13118 for <Genetic-Programming@list.stanford.edu>; Sat, 5 Mar 1994 21:18:02 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA16616; Sat, 5 Mar 1994 21:16:58 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA09109; Sat, 5 Mar 94 21:16:53 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA05863; Sat, 5 Mar 94 21:16:52 PST
Date: Sat, 5 Mar 94 21:16:52 PST
Message-Id: <9403060516.AA05863@hume.CS.ORST.EDU>
To: p00396@psilink.com
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <2972011253.1.p00396@psilink.com>
Subject: 500-line GP system (Common Lisp w/ CLOS)
Status: RO

I've put together what I think is a rather nice GP system that runs
under Common Lisp with CLOS (the Common Lisp Object System).  Who
wants to put it in their archive?

It's fairly small (under 500 lines) -- would it be appropriate to post
it here?

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\  "It is possible, by extremely perverse manipulation of the package system, /
/ to cause the sequence of letters NIL to be recognized not as the symbol that\
\ represents the empty list but as another symbol with the same name." -CLtL2 /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sat Mar  5 23:15:21 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA23141
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sat, 5 Mar 1994 23:11:44 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id UAA13096 for <Genetic-Programming@list.stanford.edu>; Sat, 5 Mar 1994 20:14:19 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA15635; Sat, 5 Mar 1994 20:13:15 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA08809; Sat, 5 Mar 94 20:13:13 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA05842; Sat, 5 Mar 94 20:13:11 PST
Date: Sat, 5 Mar 94 20:13:11 PST
Message-Id: <9403060413.AA05842@hume.CS.ORST.EDU>
To: genetic-programming@cs.stanford.edu
Subject: Memory management for GP systems
Status: RO

I've just hacked up a GP system for my sexiness tests (which are being
run even as I write).  Doing this, I am reminded that GP takes up
outrageous amounts of memory.

What is the standard response to this -- use virtual memory, or get
ahold of a "real" machine?

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\  "It is possible, by extremely perverse manipulation of the package system, /
/ to cause the sequence of letters NIL to be recognized not as the symbol that\
\ represents the empty list but as another symbol with the same name." -CLtL2 /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sat Mar  5 23:32:23 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA23259
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sat, 5 Mar 1994 23:18:37 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id UAA13093 for <Genetic-Programming@list.stanford.edu>; Sat, 5 Mar 1994 20:01:43 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received:  by Sunburn.Stanford.EDU (5.67b/25-SUNBURN-eef) id AA15406; Sat, 5 Mar 1994 20:00:39 -0800
Date: Sat, 5 Mar 94 20:00:39 PST
From: John Koza <koza@cs.stanford.edu>
To: genetic-programming@cs.stanford.edu
Subject: Last Call for GP Papers for Annotated Bibliography
Message-Id: <CMM.0.90.4.762926439.koza@Sunburn.Stanford.EDU>
Status: RO

Genetic Programming II typesetting will be finished this week.
I have an annotated bibliography of genetic programming papers in this book.
So far I've located 88 GP papers.   I've listed 65 of them below. The other
23 are the accepted papers for the June 26-July 2, 1994 World Congress on
Computational Intelligence to be held this summer in Florida.

If you have written (or know about) a published paper or
about-to-be-published paper on GP, I'd like to include it. Any conference
paper, journal article, tech report, thesis, chapter in edited book is OK. If
it's for a conference this summer, it must be already ACCEPTED (not just
submitted), but it need not be published yet to be included.

Since time is short, I need, by e-mail, an abstract of the article, author names,
title, exact title of publication, editor of publication (e.g., if it's a
conference proceedings with an editor), name of publisher.  A copy of the full
article should be sent by mail as well.

John Koza
Box K
Los Altos, CA 94023 USA

I will place the entire bibliography into the GP archive shortly (along with
the 23 IEEE papers that were accepted - but not officially announced as of
today).

------------


Abbott, R. J. 1991.  Niches as a GA divide-and-
conquer strategy.  In Chapman, Art and Myers, 
Leonard (editors).  Proceedings of the Second Annual 
AI Symposium for the California State University.  
California State University. 

Altenberg, L. 1994.  The evolution of evolvability in 
genetic programming.  In Kinnear, K. E. Jr. 
(editor). Advances in Genetic Programming.  The 
MIT Press.  

Andre, D. 1994a.  Automatically defined features: 
The simultaneous evolution of 2-dimensional 
feature detectors and an algorithm for using them.  
In Kinnear, K. E. Jr. (editor). Advances in Genetic 
Programming.  The MIT Press.

Andrews, M. and Prager, R. 1994.  Genetic 
programming for the acquisition of double auction 
market strategies.  In Kinnear, K. E. Jr. (editor). 
Advances in Genetic Programming.  The MIT Press.  

Angeline, P. J. 1994a.  Evolutionary Algorithms and 
Emergent Intelligence. Ph.D. dissertation.  
Computer Science Department.  The Ohio State 
University.

Angeline, P. J. 1994b.  Genetic programming and the 
emergence of intelligence.  In Kinnear, K. E. Jr. 
(editor). Advances in Genetic Programming.  The 
MIT Press. 

Angeline, P. J. and Pollack, J. B. 1992.  The 
evolutionary induction of subroutines.  Proceedings 
of the Fourteenth Annual Conference of the Cognitive 
Science Society. Lawrence Erlbaum.  

Angeline, P. J. and Pollack, J. B. 1993a. Coevolving 
High-Level Representations.  Technical report 92-PA-
COEVOLVE.  Laboratory for Artificial 
Intelligence.  The Ohio State University.  July 1993.  

Angeline, P. J. and Pollack, J. B. 1993b.  Competitive 
environments evolve better solutions for complex 
tasks.  In Forrest, Stephanie (editor). Proceedings of 
the Fifth International Conference on Genetic 
Algorithms.  Morgan Kaufmann. 

Angeline, P. J. and Pollack, J. B. 1994. Coevolving 
high-level representations.   In Langton, C. G. 
(editor).  Artificial Life III, SFI Studies in the Sciences 
of Complexity. Volume XVII. Addison-Wesley. 

Atkin, M. and Cohen, P. R. 1993a.  Genetic 
programming to learn an agent's monitoring 
strategy.  Proceedings of the AAAI-93 Workshop on 
Learning Action Models.  AAAI Press.    

Atkin, M. and Cohen, P. R. 1993b.  Genetic 
programming to learn an agent's monitoring 
strategy.  Technical report TR-93-26, Computer 
Science Department, University of Massachusetts, 
Amherst.

Banzhaf, Wolfgang. 1993.  Genetic programming for 
pedestrians. In Forrest, S. (editor).  Proceedings of 
the Fifth International Conference on Genetic 
Algorithms.  Morgan Kaufmann.  

D'haeseleer, P. and Bluming, J. 1994.  Effects of 
locality in individual and population evolution.  In 
Kinnear, K. E. Jr. (editor). Advances in Genetic 
Programming.  The MIT Press.  

Gruau, F. 1992a.  Genetic synthesis of Boolean 
neural networks with a cell rewriting 
developmental process.  In Schaffer, J. D. and 
Whitley, D. (editors).  Proceedings of the Workshop 
on Combinations of Genetic Algorithms and Neural 
Networks 1992.  The IEEE Computer Society Press.

Gruau, F. 1992b.  Cellular encoding of Genetic Neural 
Networks.  Technical report 92-21.  Laboratoire de 
l'Informatique du Parallelisme.  Ecole Normale 
Superieure de Lyon.  

Gruau, F. 1993a.  Genetic synthesis of modular 
neural networks.  In Forrest, S. (editor).  
Proceedings of the Fifth International Conference on 
Genetic Algorithms.  Morgan Kaufmann.  

Gruau, F. 1993b.  Grammatical inference with 
genetic search using cellular encoding.  In Lucas, 
Simon (editor). Proceedings of the International 
Conference on Grammatical Inference.  The Institution 
of Electrical Engineers, London.  

Gruau, F. 1994a.  Neural Network Synthesis using 
Cellular Encoding and the Genetic Algorithm.  PhD 
thesis.  Laboratoire de l'Informatique du 
Parallelisme, Ecole Normale Superieure de Lyon. 
Lyon, France.  

Gruau, F. 1994b.  Genetic micro programming of 
neural networks.  In Kinnear, K. E. Jr. (editor). 
Advances in Genetic Programming.  The MIT Press. 

Gruau, F and Whitley, D. 1993a.  The cellular 
development of neural networks: The interaction 
of learning and evolution.  Technical report 93-04.  
Laboratoire de l'Informatique du Parallelisme, 
Ecole Normale Superieure de Lyon. Lyon, France.  

Gruau, F and Whitley, D.  1993b. Adding learning to 
the cellular development process: a comparative 
study. Evolutionary Computation.  1(3):213-233. 

Handley, S. 1993a.  Automated learning of a 
detector for a-helices in protein sequences via 
genetic programming.  In Forrest, S. (editor).  
Proceedings of the Fifth International Conference on 
Genetic Algorithms.  Morgan Kaufmann. 

Handley, S. 1993b.  The genetic planner: The 
automatic generation of plans for a mobile robot 
via genetic programming.  Proceedings of the Eighth 
IEEE International Symposium on Intelligent Control.  
The IEEE Control System Society.  

Handley, S. 1993c.  The automatic generation of 
plans for a mobile robot via genetic programming 
with automatically defined functions.  Proceedings 
of the Fifth Workshop on Neural Networks: An 
International Conference on Computational 
Intelligence: Neural Networks, Fuzzy Systems, 
Evolutionary Programming, and Virtual Reality.  The 
Society for Computer Simulation.  

Handley, S. 1994a. The automatic generation of 
plans for a mobile robot via genetic programming 
with automatically defined functions.  In Kinnear, 
K. E. Jr. (editor). Advances in Genetic Programming.  
The MIT Press.  

Iba, H., de Garis, H., and Higuchi, T. 1993.  
Evolutionary learning of predatory  behaviors 
based on structured classifiers.  In Meyer, J. A., 
Roitblat, H. L. and Wilson, S. W. (editors). From 
Animals to Animats 2: Proceedings of the Second 
International Conference on Simulation of Adaptive 
Behavior.  The MIT Press.  

Iba, H., de Garis, H., and Sato, T. 1993.  Solving 
identification problems by structured genetic 
algorithms.  Technical report  ETL-TR-93-17.  Japan 
Electrotechnical Laboratory. 

Iba, H., de Garis, H., and Sato, T. 1994. Genetic 
programming using a minimum description 
length principle.  In Kinnear, K. E. Jr. (editor). 
Advances in Genetic Programming.  The MIT Press. 

Iba, H., Kurita, T., de Garis, H., and Sato, T. 1993.  
System identification using structured genetic 
algorithms.  In Forrest, S. (editor).  Proceedings of 
the Fifth International Conference on Genetic 
Algorithms.  Morgan Kaufmann.  

Iba, H. and Sato, T.  1992.  Meta-level strategy 
learning for GA based on structured 
representation.  In Proceedings of the Second Pacific 
Rim International Conference on Artificial Intelligence.  
Center for Artificial Intelligence Research, KAIST.  

Iba, H. and Sato, T.  1994.  Extension of 
STROGANOFF for symbolic problems.  Technical 
report  ETL-TR-94-1.  Japan Electrotechnical 
Laboratory. 

Jannink, Jan. 1994.  Cracking and co-evolving 
randomizers.  In Kinnear, K. E. Jr. (editor). 
Advances in Genetic Programming.  The MIT Press.

Jiang, M. 1992.  A hierarchical genetic system for 
symbolic function identification.  Master's thesis.  
University of Montana.  

Jiang, M. 1993.  An adaptive function identification 
system.  Proceedings of the IEEE/ACM Conference on 
Developing and Managing Intelligent System Projects, 
Vienna, Virginia, March 1993.  

Jiang, M. and Wright, A. H. 1992. A hierarchical 
genetic system for symbolic function 
identification.  Proceedings of the 24th Symposium on 
the Interface: Computing Science and Statistics, College 
Station, Texas, March 1992.  

Keith, M. J. and Martin, M. C.  Genetic 
programming in C++: Implementation issues.  In 
Kinnear, K. E. Jr. (editor). Advances in Genetic 
Programming.  The MIT Press.  

Kinnear, K. E., Jr. 1993a. Evolving a sort: Lessons in 
genetic programming.  1993 IEEE International 
Conference on Neural Networks, San Francisco. IEEE 
Press.  Volume 2. 

Kinnear, K. E., Jr. 1993b.  Generality and difficulty in 
genetic programming: Evolving a sort.  In Forrest, 
S. (editor).  Proceedings of the Fifth International 
Conference on Genetic Algorithms.  Morgan 
Kaufmann. 

Kinnear, K. E., Jr. (editor). 1994a. Advances in Genetic 
Programming.  Cambridge: The MIT Press. 

Kinnear, K. E., Jr.  1994b.  Alternatives in automatic 
function definition: A comparison of performance.  
In Kinnear, K. E., Jr. (editor). Advances in Genetic 
Programming.  Cambridge: The MIT Press. 

Masand, B. 1994.  Optimizing confidence of text 
classification by evolution of symbolic 
expressions.  In Kinnear, K. E. Jr. (editor). Advances 
in Genetic Programming.  The MIT Press.

Nguyen, T. and Huang, T.  1994. Evolvable 
modeling: Structural adaptation through 
hierarchical evolution for 3-D model-based vision.  
In Kinnear, K. E. Jr. (editor). Advances in Genetic 
Programming.  The MIT Press.

Nordin, P. 1994.  A compiling genetic programming 
system that directly manipulates the machine 
code.  In Kinnear, K. E. Jr. (editor). Advances in 
Genetic Programming.  The MIT Press. 

Oakley, E. H. N. 1994.  Two scientific applications of 
genetic programming: Stack filters and non-linear 
equation fitting to chaotic data.  In Kinnear, K. E. 
Jr. (editor). Advances in Genetic Programming.  The 
MIT Press.

O'Reilly, U. M. and Oppacher, F.  1992.  An 
experimental perspective on genetic 
programming.  In Maenner, R. and Manderick, B. 
(editors). Proceedings of the Second International 
Conference on Parallel Problem Solving from Nature.  
North Holland.  

Reynolds, C. W. 1993.  An evolved vision-based 
behavioral model of coordinated group motion.  In 
Meyer, Jean-Arcady, Roitblat, H. L. and Wilson, S. 
W. (editors). From Animals to Animats 2: Proceedings 
of the Second International Conference on Simulation 
of Adaptive Behavior.  The MIT Press.  

Reynolds, C. W.  1994a.  Evolution of obstacle 
avoidance behavior: Using noise to promote 
robust solutions.  In Kinnear, K. E. Jr. (editor). 
Advances in Genetic Programming.  The MIT Press. 

Reynolds, C. W. 1994b.  An evolved vision-based 
model of obstacle avoidance behavior.  In 
Langton, C. G. (editor). Artificial Life III, SFI Studies 
in the Sciences of Complexity. Volume XVII. 
Addison-Wesley.  

Ryan, C. 1994.  Pygmies and civil servants.  In 
Kinnear, K. E. Jr. (editor). Advances in Genetic 
Programming.  The MIT Press. 

Siegel, E. 1994.  Competitively evolving decision 
trees against fixed training cases for natural 
language processing.  In Kinnear, K. E. Jr. (editor). 
Advances in Genetic Programming.  The MIT Press.

Sims, K.  1991a.  Artificial evolution for Computer 
Graphics.  Computer Graphics. 25(4): 319-328. July 
1991.  

Sims, K. 1991b. Panspermia.  In Langton, Christopher 
G. (editor). Artificial Life II Video Proceedings.  
Addison-Wesley.  

Sims, K. 1992a. Interactive evolution of dynamical 
systems.  In Varela, F. J., and Bourgine, P. 
(editors). Toward a Practice of Autonomous Systems: 
Proceedings of the First European Conference on 
Artificial Life.  The MIT Press.  

Sims, K. 1992b.  Interactive evolution of equations 
for procedural models. Proceedings of IMAGINA 
conference, Monte Carlo, January 29-31, 1992.  

Sims, K. 1993a.  Interactive evolution of equations 
for procedural models.  The Visual Computer.  
9:466-476. 

Sims, K. 1993b.  Evolving Images. Lecture presented 
at Centre George Pompidou, Paris on March 4, 
1993.  Notebook.  Number 5.  

Spencer, G. 1993.  Automatic generation of 
programs for crawling and walking.  In Forrest, S. 
(editor).  Proceedings of the Fifth International 
Conference on Genetic Algorithms.  Morgan 
Kaufmann.  

Spencer, G. 1994. Automatic generation of programs 
for crawling and walking.  In Kinnear, K. E. Jr. 
(editor). Advances in Genetic Programming.  The 
MIT Press.

Tackett, W. A. 1993a. Genetic programming for 
feature discovery and image discrimination.  In 
Forrest, S. (editor).  Proceedings of the Fifth 
International Conference on Genetic Algorithms.  
Morgan Kaufmann.  

Tackett, W. A. 1993b.  Genetic generation of 
dendritic trees for image classification.  Proceedings 
of the World Conference on Neural Networks, Portland, 
Oregon, July 1993.  IEEE Press. 

Tackett, W. A. and Carmi, A. 1994a.  Scalability, 
generalization, and breeding schemes in genetic 
programming: The donut problem.  In Kinnear, K. 
E. Jr. (editor). Advances in Genetic Programming.  
The MIT Press.

Teller, A. 1993.  Learning mental models. Proceedings 
of the Fifth Workshop on Neural Networks: An 
International Conference on Computational 
Intelligence: Neural Networks, Fuzzy Systems, 
Evolutionary Programming, and Virtual Reality.  The 
Society for Computer Simulation. 

Teller, A. 1994a. The evolution of mental models.  In 
Kinnear, K. E. Jr. (editor). Advances in Genetic 
Programming.  The MIT Press.

Teller, Astro. 1994b.  Genetic programming, indexed 
memory, the halting problem, and other 
curiosities.  Proceedings of the Seventh Florida 
Artificial Intelligence Research Symposium.  

Thonemann, U. W. 1992.  Verbesserung des Simulated 
Annealing unter Anwendung Genetischer 
Programmierung am Beispiel des Diskreten 
Quadratischen Layoutproblems.  Master's thesis, 
University of Paderborn, Germany.  

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sat Mar  5 23:31:30 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA23304
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sat, 5 Mar 1994 23:23:34 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id UAA13106 for <Genetic-Programming@list.stanford.edu>; Sat, 5 Mar 1994 20:35:09 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from worldlink.worldlink.com (worldlink.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA16085; Sat, 5 Mar 1994 20:34:05 -0800
Received: by worldlink.worldlink.com (5.65b/4.0.071791-Worldlink)
	id AA08854; Sat, 5 Mar 94 23:33:09 -0500
Message-Id: <2972011253.1.p00396@psilink.com>
Date: Sat, 05 Mar 94 22:43:02 -0500
To: "GP list" <genetic-programming@cs.stanford.edu>
From: "Andrew Singleton" <p00396@psilink.com>
Organization: Creation Mechanics
Subject: GPQUICK fix
X-Mailer: PSILink-DOS (3.4)
Status: RO

There is a small bug in the last distributed version of GPQUICK.

In file CHROME.CPP, function GetTarget(), approximately line 938,

          } else if (pop[winner]->nfitness->IsBetter(pop[target]->nfitness))

Should be:

          } else if (pop[target]->nfitness->IsBetter(pop[winner]->nfitness))

This will improve your performance for KillTournament sizes greater than 1
(the default is 2).
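
For readers without the GPQUICK source at hand, here is a minimal sketch of
what a kill tournament is meant to do.  It is not GPQUICK's code: the names
worst_of and pick_kill_target are made up for illustration, and it assumes
that higher fitness is better.  The point is only that the comparison has to
keep the worse of the two members, which is what the corrected line does.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not from GPQUICK): index of the least fit candidate.
// Assumes higher fitness is better and that `candidates` is non-empty.
std::size_t worst_of(const std::vector<double>& fitness,
                     const std::vector<std::size_t>& candidates) {
    std::size_t target = candidates.front();
    for (std::size_t challenger : candidates) {
        // Keep the worse of the two: the current target is replaced
        // whenever it is better than the challenger.
        if (fitness[target] > fitness[challenger]) target = challenger;
    }
    return target;
}

// Hypothetical kill tournament: draw `tournament_size` members uniformly
// (with replacement) and return the index of the one to overwrite.
// Assumes a non-empty population and tournament_size >= 1.
std::size_t pick_kill_target(const std::vector<double>& fitness,
                             std::size_t tournament_size,
                             std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, fitness.size() - 1);
    std::vector<std::size_t> candidates;
    for (std::size_t i = 0; i < tournament_size; ++i) {
        candidates.push_back(pick(rng));
    }
    return worst_of(fitness, candidates);
}
```

With the comparison reversed, the loop would keep the best of the sampled
members as the kill target, which throws away good individuals faster as the
tournament size grows.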

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar  4 00:31:01 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA00227
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 4 Mar 1994 00:25:43 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id VAA11002 for <Genetic-Programming@list.stanford.edu>; Thu, 3 Mar 1994 21:46:07 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from latcs1.lat.OZ.AU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA15981; Thu, 3 Mar 1994 21:45:02 -0800
Received: from latcs4.lat.OZ.AU by latcs1.lat.oz.au (5.67b/1.34)
	id AA01926; Fri, 4 Mar 1994 16:44:50 +1100
Message-Id: <199403040544.AA21763@latcs4.lat.oz.au>
From: barton@latcs1.lat.oz.au (Douglas P. Barton [Sun Dragon])
Date: Fri, 4 Mar 1994 16:44:57 -0500
X-Mailer: Mail User's Shell (7.2.3 5/22/91)
To: Genetic Programming List <genetic-programming@CS.STANFORD.EDU>
Subject: varying genome length
Status: RO

Hi GPers,

I was just wondering whether there is much interest in EA systems that allow
individuals to have varying genome lengths.  This would, I suppose, lead to the
notion of species and one would have to consider the implications of species
interaction etc...

anyway, if anyone has any thoughts, I here :)


-- 
Douglas Barton - barton@latcs1.oz.au
                 C/- Menzies College, La Trobe University,
                 Bundoora 3083, Australia.
                 AH: +61 3 479 2869     BH: +61 3 479 1326
And your fortune for today --

			*** NEWSFLASH ***
Russian tanks steamrolling through New Jersey!!!!  Details at eleven!

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar  4 16:57:30 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA03912
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 4 Mar 1994 16:44:17 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id NAA11966 for <Genetic-Programming@list.stanford.edu>; Fri, 4 Mar 1994 13:23:58 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from cs.brandeis.edu (berry.cs.brandeis.edu) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA11226; Fri, 4 Mar 1994 13:22:54 -0800
Received: from rasp.cs.brandeis.edu by cs.brandeis.edu Fri, 4 Mar 1994 16:22:55 -0500
Received:  by rasp.cs.brandeis.edu (1.37.109.8/UofC3.0)
	id AA07169; Fri, 4 Mar 1994 16:22:54 -0500
Date: Fri, 4 Mar 1994 16:22:54 -0500
From: Patrick Tufts <zippy@cs.brandeis.edu>
Message-Id: <9403042122.AA07169@rasp.cs.brandeis.edu>
To: barton@latcs1.lat.oz.au
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <199403040544.AA21763@latcs4.lat.oz.au> (barton@latcs1.lat.oz.au)
Subject: Re: varying genome length
Status: RO

   From: barton@latcs1.lat.oz.au (Douglas P. Barton [Sun Dragon])
   Date: Fri, 4 Mar 1994 16:44:57 -0500

   I was just wondering whether there is much interest in EA systems
   that allow individuals to have varying genome lengths.  This would,
   I suppose, lead to the notion of species and one would have to
   consider the implications of species interaction etc...

Well, GP uses variable length representations (Lisp expressions).
There's also Craig Shaefer's ARGOT system, which sits on top of a
learning system - a GA, for example - and dynamically changes length
of the binary genome.
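
To make "variable length" concrete for a linear genome, here is a small
sketch of one-point crossover where each parent gets its own cut point, so
the children can come out longer or shorter than either parent.  This is an
illustration only, not how ARGOT or any particular GP system does it; the
names Genome and cut_and_swap are invented for the example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Genome = std::vector<int>;  // bits stored as 0/1 (illustrative choice)

// Illustrative only: one-point crossover with an independent cut point in
// each parent.  Child 1 is head-of-a + tail-of-b, child 2 is head-of-b +
// tail-of-a, so lengths are not preserved.
// Requires cut_a <= a.size() and cut_b <= b.size().
std::pair<Genome, Genome> cut_and_swap(const Genome& a, std::size_t cut_a,
                                       const Genome& b, std::size_t cut_b) {
    const auto ca = static_cast<std::ptrdiff_t>(cut_a);
    const auto cb = static_cast<std::ptrdiff_t>(cut_b);

    Genome child1(a.begin(), a.begin() + ca);             // head of a
    child1.insert(child1.end(), b.begin() + cb, b.end()); // tail of b

    Genome child2(b.begin(), b.begin() + cb);             // head of b
    child2.insert(child2.end(), a.begin() + ca, a.end()); // tail of a

    return {child1, child2};
}
```

The total number of genes is conserved across the two children, but how they
are split between them depends on the two cut points.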

--Pat

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar  4 10:36:53 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA21132
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 4 Mar 1994 10:23:29 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id HAA11506 for <Genetic-Programming@list.stanford.edu>; Fri, 4 Mar 1994 07:18:27 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from vm1.ulaval.ca by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA27494; Fri, 4 Mar 1994 07:17:21 -0800
Message-Id: <199403041517.AA27494@Sunburn.Stanford.EDU>
Received: from VM1.ULAVAL.CA by VM1.ulaval.ca (IBM VM SMTP V2R2)
   with BSMTP id 7596; Fri, 04 Mar 94 10:16:47 EST
Received: from VM1.ULAVAL.CA (NJE origin RAHNJ@LAVALVM1) by VM1.ULAVAL.CA (LMail V1.1d/1.7f) with RFC822 id 7591; Fri, 4 Mar 1994 10:16:46 -0500
Date:         Fri, 04 Mar 94 10:11:53 EST
From: RAHNJ@VM1.ulaval.ca
Subject:      Re: varying genome length
To: GP list server <genetic-programming@cs.stanford.edu>
Status: RO

Isn't this what the ALifers at Santa Fe are all about? Mind you, their
programmes seem to have only survival as the fitness criterion. Their
anonymous ftp address is sfi.santafe.edu.


R. Joel Rahn
e-mail: RAHNJ@VM1.ULAVAL.CA
tel/tél: 418/656 7163  & FAX: 418/656 2624
paper mail: F.Sc.Admin., Université Laval, Ste-Foy QC, G1K 7P4 CANADA

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar  4 07:18:35 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA15061
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 4 Mar 1994 07:02:49 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id EAA11372 for <Genetic-Programming@list.stanford.edu>; Fri, 4 Mar 1994 04:18:15 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from sgigate.SGI.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA24058; Fri, 4 Mar 1994 04:17:11 -0800
Received: from relay.sgi.com (relay.sgi.com [192.26.51.36]) by sgigate.sgi.com (8.6.4/8.6.4) with SMTP id EAA09813; Fri, 4 Mar 1994 04:17:11 -0800
Received: from giraffe.asd.sgi.com by relay.sgi.com via SMTP (920330.SGI/920502.SGI)
	for @sgigate.sgi.com:genetic-programming@cs.stanford.edu id AA07570; Fri, 4 Mar 94 04:17:09 -0800
Received: from ivan.asd.sgi.com by giraffe.asd.sgi.com via SMTP (920330.SGI/920502.SGI)
	for @relay.sgi.com:genetic-programming@cs.stanford.edu id AA25684; Fri, 4 Mar 94 04:17:08 -0800
Received: by ivan.asd.sgi.com (930416.SGI/900721.SGI)
	for @giraffe.asd.sgi.com:genetic-programming@cs.stanford.edu id AA29819; Fri, 4 Mar 94 04:17:06 -0800
Date: Fri, 4 Mar 94 04:17:06 -0800
From: ib@ivan.asd.sgi.com (Ivan Bach)
Message-Id: <9403041217.AA29819@ivan.asd.sgi.com>
To: genetic-programming@cs.stanford.edu
Subject: Re: varying genome length
Status: RO

barton@latcs1.lat.oz.au (Douglas P. Barton [Sun Dragon]) writes:
> This would, I suppose, lead to the notion of species ...
In nature, new species apparently get created when groups of members of 
a species are separated and isolated for a long time.  Over time, they 
tend to diverge so much that the individuals from different groups are no
longer able to interbreed, i.e., produce viable offspring.  

There is definitely some competition going on between species in the
filling of available niches.  Until we started using our mailgroup, insects
seemed to be much better at creating new species than mammals.  There is also
an interdependence between species if, for example, one species uses another
species for food directly or indirectly.  If enough human beings with 
adequate gene pools travel to planets and stars, it is likely that new 
species will be created.  If enough insects travel with them, it is very 
likely that new species will be created, provided that they are able to
survive in new environments. 

Ivan Bach, ib@sgi.com
Silicon Graphics, Inc.
Mountain View, California

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Mar  4 05:45:20 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA13972
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 4 Mar 1994 05:39:40 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id CAA11312 for <Genetic-Programming@list.stanford.edu>; Fri, 4 Mar 1994 02:53:38 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from lifl.lifl.fr by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA20978; Fri, 4 Mar 1994 02:52:22 -0800
Received: from bock (bock.lifl.fr) by lifl.lifl.fr, Fri, 4 Mar 1994 11:49:10 +0100
Received: by bock, Fri, 4 Mar 1994 11:53:53 +0100
Date: Fri, 4 Mar 1994 11:53:53 +0100
From: Philippe.Preux@lifl.fr
Message-Id: <9403041053.AA08777@bock>
To: barton@latcs1.lat.oz.au
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <199403040544.AA21763@latcs4.lat.oz.au> (barton@latcs1.lat.oz.au)
Subject: Re: varying genome length
Status: RO


> Hi GPers,
> 
> I was just wondering whether there is much interest in EA systems that allow
> individuals to have varying genome lengths.  This would, I suppose, lead to the
> notion of species and one would have to consider the implications of species
> interaction etc...
>
> anyway, if anyone has any thoughts, I here :)
>

In fact, your question is twofold:
- You should take a look at ecological systems where several kinds of
  individuals are interacting, co-evolving and co-adapting. Their
  genomes are of varying length.

- About (almost pure) genetic algorithm-like systems with individuals
  having varying-length genomes, see also Goldberg's work on
  messy GAs. However, in these systems, there are not, strictly speaking,
  several species of individuals.
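
As a rough illustration of the messy-GA idea, the sketch below decodes a
variable-length chromosome made of (locus, allele) pairs.  It follows the
usual description of messy GAs: a locus named more than once is resolved
first-come-first-served, scanning left to right, and a locus never named is
filled in from a template string.  The names MessyGene and express are
assumptions made for this example, not taken from any published code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using MessyGene = std::pair<std::size_t, int>;  // (locus, allele)

// Decode a messy chromosome into a fixed-length bit string.
// - Overspecified loci: the leftmost gene naming the locus wins.
// - Underspecified loci: the value is taken from `tmpl`.
// Genes whose locus lies outside the template are ignored.
std::vector<int> express(const std::vector<MessyGene>& chromosome,
                         const std::vector<int>& tmpl) {
    std::vector<int> bits = tmpl;                 // start from the template
    std::vector<bool> seen(tmpl.size(), false);   // loci already decided
    for (const MessyGene& gene : chromosome) {
        const std::size_t locus = gene.first;
        if (locus < tmpl.size() && !seen[locus]) {
            bits[locus] = gene.second;
            seen[locus] = true;
        }
    }
    return bits;
}
```

For example, the chromosome ((2,1) (0,1) (2,0)) over an all-zero template of
length 4 decodes to 1 0 1 0: the second gene for locus 2 is ignored, and
loci 1 and 3 come from the template.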


Hope I helped you

Philippe

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Mar  3 17:16:58 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA08598
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 3 Mar 1994 17:05:58 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id OAA09077 for <Genetic-Programming@list.stanford.edu>; Thu, 3 Mar 1994 14:25:34 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from flubber.cc.utexas.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA04002; Thu, 3 Mar 1994 14:24:21 -0800
Received: by flubber.cc.utexas.edu id AA02952
  (5.65c/IDA-1.4.4 for genetic-programming@cs.stanford.edu); Thu, 3 Mar 1994 16:21:50 -0600
From: Jim McCoy <mccoy>
Message-Id: <199403032221.AA02952@flubber.cc.utexas.edu>
Subject: Re: GP systems overview
To: ib@ivan.asd.sgi.com (Ivan Bach)
Date: Thu, 3 Mar 1994 16:21:48 -0600 (CST)
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <9403031726.AA27210@ivan.asd.sgi.com> from "Ivan Bach" at Mar 3, 94 09:26:17 am
X-Mailer: ELM [version 2.4 PL21]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 435       
Status: RO

> From: ib@ivan.asd.sgi.com (Ivan Bach)
> 
> What we really need is a URL (Universal Resource Locator) for GP, similar
> to the following URL for VR:

Try http://wwwhost.cc.utexas.edu/cc/staff/mccoy/gp/gp.html

It is mostly an HTML-ized version of the FAQ and is somewhat out of date, as
our WWW server moved and I have not updated some of the HREFs, but I am
starting to do some more web stuff so it will eventually get better...

jim

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Mar  3 17:09:40 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA07987
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 3 Mar 1994 16:49:29 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id OAA08999 for <Genetic-Programming@list.stanford.edu>; Thu, 3 Mar 1994 14:03:21 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from ai.iit.nrc.ca (itisgate.nrc.ca) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA03105; Thu, 3 Mar 1994 14:02:12 -0800
Message-Id: <9403032201.AA25249@ai.iit.nrc.ca>
Date: Thu, 3 Mar 94 17:01:38 EST
From: Peter Turney <peter@ai.iit.nrc.ca>
To: genetic-programming@cs.stanford.edu
Subject: Re: GP systems overview
Cc: peter@ai.iit.nrc.ca
Status: RO


> 
> What we really need is a URL (Universal Resource Locator) for GP, similar
> to the following URL for VR:
> 
> .....
> 
> Ivan Bach, ib@sgi.com
> Silicon Graphics, Inc.
> Mountain View, California 
> 

This might be of interest to GPers:


---------------------------------------------------------------------------


		The Knowledge Systems Laboratory Announces
			A New World Wide Web Server
		with an Emphasis on Machine Learning Resources


The World Wide Web is a hypermedia document that spans the Internet.
If you are not familiar with the Web, you can get acquainted by
obtaining a free copy of Mosaic from the National Center for
Supercomputing Applications. For example, if you have a Sun:

	unix> ftp ftp.ncsa.uiuc.edu

	Name: anonymous
	Password: <your e-mail address>

	ftp> cd Mosaic/Mosaic-binaries
	ftp> binary
	ftp> get Mosaic-sun.Z
	ftp> bye

	unix> uncompress Mosaic-sun.Z
	unix> Mosaic-sun

Like ftp (file transfer protocol), the World Wide Web is based on a
protocol for the exchange of files over the Internet. The protocol is
known as http (hypertext transfer protocol). This new protocol subsumes
older protocols, such as ftp and gopher. Mosaic uses http to
provide a hypermedia interface (text, hypertext, graphics, movies,
audio) to the Internet.

The Knowledge Systems Laboratory of the National Research Council
of Canada has set up a World Wide Web Server (analogous to an
ftp server) that delivers information relevant to AI researchers,
especially machine learning researchers. If you have Mosaic,
you may access the KSL server using the URL:

	http://ai.iit.nrc.ca/home_page.html

Please let me know if you have any suggestions about information
that could be added to our server. Feedback of any kind is most
welcome.


---------------------------------------------------------------------------
 ___    __    _____        ____    |
/_ /\  /_/|  /____/ \    /___ /|   | Peter D. Turney  (peter@ai.iit.nrc.ca)
| |\ \ | || |  __ \ /|  / ___|/    | Knowledge Systems Laboratory
| ||\ \| || | |__) |/  | | |__     | National Research Council Canada
| || \   || |  __  /\  | |/__ /|   | Ottawa, Ontario, Canada, K1A 0R6
|_|/  \__|/ |_|/ \_\/   \____|/    | (613) 993-8564  FAX: 952-7151

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Mar  3 14:31:12 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA01884
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 3 Mar 1994 14:18:55 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA07900 for <Genetic-Programming@list.stanford.edu>; Thu, 3 Mar 1994 11:14:40 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from mail.netcom.com (netcom6.netcom.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA25597; Thu, 3 Mar 1994 11:13:36 -0800
Received: from localhost by mail.netcom.com (8.6.4/SMI-4.1/Netcom)
	id LAA09452; Thu, 3 Mar 1994 11:14:16 -0800
Date: Thu, 3 Mar 1994 11:14:16 -0800
From: order@netcom.com (Walter Alden Tackett)
Message-Id: <199403031914.LAA09452@mail.netcom.com>
To: keller@trurl.informatik.uni-dortmund.de
Cc: genetic-programming@cs.stanford.edu,
        keller@trurl.informatik.uni-dortmund.de
In-Reply-To: <9403031659.AA02202@trurl.informatik.uni-dortmund.de> (message from Robert Keller on 03 Mar 1994 17:59:15 +0100)
Subject: Re: GP systems overview
Status: RO

most of them can be had from ftp.cc.utexas.edu.  Andy's code is
documented in what I would consider a sales pitch ;-) published in
Byte.  SGPC is relatively poorly documented with the exception that I
and many others have USED it extensively in our published works.  It
also has the advantage of being thoroughly run through LINT and PURIFY, and
cross-verified across many systems to ensure it gets the same answers on
all of them.  Most people say the code is OK, but maybe they are just
being polite.


-walter tackett

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Mar  3 12:41:19 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA27885
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 3 Mar 1994 12:24:14 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA07699 for <Genetic-Programming@list.stanford.edu>; Thu, 3 Mar 1994 09:27:26 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from sgigate.SGI.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA20877; Thu, 3 Mar 1994 09:26:21 -0800
Received: from relay.sgi.com (relay.sgi.com [192.26.51.36]) by sgigate.sgi.com (8.6.4/8.6.4) with SMTP id JAA01719; Thu, 3 Mar 1994 09:26:20 -0800
Received: from giraffe.asd.sgi.com by relay.sgi.com via SMTP (920330.SGI/920502.SGI)
	for @sgigate.sgi.com:genetic-programming@cs.stanford.edu id AA09042; Thu, 3 Mar 94 09:26:19 -0800
Received: from ivan.asd.sgi.com by giraffe.asd.sgi.com via SMTP (920330.SGI/920502.SGI)
	for @relay.sgi.com:genetic-programming@cs.stanford.edu id AA18207; Thu, 3 Mar 94 09:26:18 -0800
Received: by ivan.asd.sgi.com (930416.SGI/900721.SGI)
	for @giraffe.asd.sgi.com:genetic-programming@cs.stanford.edu id AA27210; Thu, 3 Mar 94 09:26:17 -0800
Date: Thu, 3 Mar 94 09:26:17 -0800
From: ib@ivan.asd.sgi.com (Ivan Bach)
Message-Id: <9403031726.AA27210@ivan.asd.sgi.com>
To: genetic-programming@cs.stanford.edu
Subject: Re: GP systems overview
Status: RO

What we really need is a URL (Universal Resource Locator) for GP, similar
to the following URL for VR:
---------------------------------------------------------------------------
From: Rob Kooper <kooper@cc.gatech.edu>
Newsgroups: comp.archives
Subject: [sci.virtual-worlds] VR and WWW.
Date: 31 Jan 1994 09:34:36 +0100
Organization: College of Computing, Georgia Tech

Archive-Name: auto/sci.virtual-worlds/VR-and-WWW

Hi,

I'm creating a page with pointers to WWW pages which have to do with
VR. I know of the existence of three of them, here at GVU at NCSA (The
CAVE) and at Delft. There are probably more of them out there.

If you know any let me know and I will add them. This way it is easy
to see what is going on in VR research.

The URL for this page is:
        http://www.gatech.edu/gvu/people/Masters/Rob.Kooper/Meta.VR.html

Rob

                                        -Rob Kooper (kooper@cc.gatech.edu)

[Co-mod's note: as you all might expect, we are preparing various
 HTML documents to provide pointers to all the information on VR
 that we know of on the net.  Unfortunately, our WWW site is not
 globally accessible yet, but we will let you know when it is.  Stay tuned.
 -- Aaron Kaleva Pulkka, scivw co-mod.]
-------------------------------------------------------------------------

Ivan Bach, ib@sgi.com
Silicon Graphics, Inc.
Mountain View, California 

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Mar  3 12:16:54 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA27394
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 3 Mar 1994 12:10:28 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA07573 for <Genetic-Programming@list.stanford.edu>; Thu, 3 Mar 1994 09:00:39 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from waldorf.Informatik.Uni-Dortmund.DE by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA19989; Thu, 3 Mar 1994 08:59:24 -0800
Received: from trurl.informatik.uni-dortmund.de
	by waldorf.informatik.uni-dortmund.de with SMTP (Sendmail 8.6.5/UniDo 2.0.14)
        id RAA18494; Thu, 3 Mar 1994 17:59:16 +0100
From: Robert Keller <keller@trurl.informatik.uni-dortmund.de>
Date: Thu, 3 Mar 94 17:59:15 +0100
Message-Id: <9403031659.AA02202@trurl.informatik.uni-dortmund.de>
Received: by trurl.informatik.uni-dortmund.de id AA02202; Thu, 3 Mar 94 17:59:15 +0100
To: genetic-programming@cs.stanford.edu
Subject: GP systems overview
Cc: keller@trurl.informatik.uni-dortmund.de
Status: RO

All

is there any survey of PD GP implementations (SGPC, GPQUICK etc) and
where to get them?


Robert

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 28 15:23:40 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA29144
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 28 Feb 1994 15:12:32 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id MAA00605 for <Genetic-Programming@list.stanford.edu>; Mon, 28 Feb 1994 12:24:19 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from HPP.Stanford.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA16335; Mon, 28 Feb 1994 12:23:15 -0800
Received: from KSL-EXP-35 (KSL-EXP-35.Stanford.EDU) by HPP.Stanford.EDU (4.1/inc-1.0)
	id AA17407; Mon, 28 Feb 94 12:23:10 PST
Message-Id: <2971455768-13145305@KSL-EXP-35>
Sender: RICE@KSL-EXP-35.Stanford.EDU
Date: Mon, 28 Feb 94  12:22:48 PST
From: James Rice <Rice@HPP.Stanford.EDU>
To: smaxwell@wpo.borland.com, dudeyp@chert.CS.ORST.EDU
Cc: genetic-programming@cs.stanford.edu
Subject: Re: Diversity and sexiness in GA/GP
In-Reply-To: <sd71cbbe.086@wpo.borland.com>
Status: RO


--> >"We feel confident that there is a 4-line LISP
--> > hack for consciousness."

--> Hey, I'd learn LISP all over again, if I can get a
--> look at this hack!

--> -+- Sid

Actually, Mr D was just being kind to the rest of the
world, who might be incredulous of the real state of
affairs.  Any serious Lisp hacker knows that the 4-line
hack for consciousness is for wimps.  There's a one-line
hack for consciousness (and pizza) written in CL's format
language.  It starts something like

  (format t "~@:{~^~:[~

I forget the rest.


Rice - Since we were on the subject of Lisps for CMs, I
       once saw the format control string for the print
       method for Xets (or maybe Xappings or Xectors,
       can't remember), I think Moon posted it on the CL
       mailing list ~8 years ago.  Really made your eyes
       pop out.

*** All Un/Subscribe messages should go to      ***
*** genetic-programming-REQUEST@cs.stanford.edu ***
***                    ^^^^^^^^                 ***

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 28 14:05:52 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA26158
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 28 Feb 1994 13:44:04 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id KAA00310 for <Genetic-Programming@list.stanford.edu>; Mon, 28 Feb 1994 10:58:45 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA11710; Mon, 28 Feb 1994 10:57:40 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA21261; Mon, 28 Feb 94 10:57:37 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA01743; Mon, 28 Feb 94 10:57:37 PST
Date: Mon, 28 Feb 94 10:57:37 PST
Message-Id: <9402281857.AA01743@hume.CS.ORST.EDU>
To: smaxwell@wpo.borland.com
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: smaxwell@wpo.borland.com's message of Mon, 28 Feb 1994 10:44:09 -0800 <sd71cbbe.087@wpo.borland.com>
Subject: Diversity and sexiness in GA/GP
Status: RO

 > Date: Mon, 28 Feb 1994 10:44:09 -0800
 > From: smaxwell@wpo.borland.com
 > 
 > > Won't idiot savants (just the sort of individual with which an absent-
 > > minded professor would want to breed, genetically speaking) be culled
 > > out by the original tournaments?
 > 
 > Not if they'd win a tournament.  An idiot savant would be more likely to
 > win a tournament than a [plain] idiot.  I'm assuming that by "idiot savant"
 > you mean an individual who does very well at some subset of a problem,
 > but otherwise poorly.

Yes, that's what I meant.

 > > How?  If both parents were idiots, how would they get past the
 > > tournaments?
 > 
 > It's unlikely that idiots would get past the tournament, which is why we
 > use them (;-).  However, idiot savants *can* get by them.  Pair two of

I don't see how.  Let's say there are 20 fitness test cases, each
worth up to ten points, and total fitness is the sum of these scores.
If you get 10 points on two of the cases, and 1 on each of the others,
you'll still lose to evenly shoddy individuals who get 2 points on
each case.
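
The arithmetic in this example can be checked with a short sketch. The score vectors follow the numbers given above; the function names and the two-way tournament rule (highest summed fitness wins) are assumptions made for illustration.

```python
# Scores from the example above: 20 fitness cases, up to 10 points each.
NUM_CASES = 20

# "Idiot savant": 10 points on two cases, 1 point on each of the other 18.
savant = [10, 10] + [1] * (NUM_CASES - 2)

# "Evenly shoddy" individual: 2 points on every case.
shoddy = [2] * NUM_CASES


def total_fitness(scores):
    """Total fitness is the plain sum of the per-case scores."""
    return sum(scores)


def tournament_winner(a, b):
    """Two-way tournament on summed fitness (assumed rule); ties go to a."""
    return a if total_fitness(a) >= total_fitness(b) else b


print(total_fitness(savant))  # 38
print(total_fitness(shoddy))  # 40
print(tournament_winner(savant, shoddy) is shoddy)  # True
```

Under a summed-fitness tournament the savant loses 38 to 40, which is the point being made: it only survives if selection looks at something other than the total.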

 > >   "We feel confident that there is a 4-line LISP hack for consciousness."
 > 
 > Hey, I'd learn LISP all over again, if I can get a look at this hack!

Well, you have to load a few packages first...

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\   "We feel confident that there is a 4-line LISP hack for consciousness."   /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 28 13:49:05 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA25825
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 28 Feb 1994 13:32:24 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id KAA00292 for <Genetic-Programming@list.stanford.edu>; Mon, 28 Feb 1994 10:49:50 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from wpo.borland.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA11231; Mon, 28 Feb 1994 10:48:45 -0800
Received: from Borland-Message_Server by wpo.borland.com
	with WordPerfect_Office; Mon, 28 Feb 1994 10:47:26 -0800
Message-Id: <sd71cbbe.086@wpo.borland.com>
X-Mailer: WordPerfect Office 4.0
Date: Mon, 28 Feb 1994 10:44:09 -0800
From: smaxwell@wpo.borland.com
To: dudeyp@chert.CS.ORST.EDU
Cc: genetic-programming@cs.stanford.edu
Subject:  Diversity and sexiness in GA/GP
Status: RO

> That sounds very similar to Mike Keith's sexiness formula.  I don't
> suppose there's any way you could find that reference, is there?

I was referring to multiple tournaments, as described in Andy Singleton's
2-Jan message "Tournament Selection", a follow-up to his Pareto optimality
discussion.

> Won't idiot savants (just the sort of individual with which an absent-
> minded professor would want to breed, genetically speaking) be culled
> out by the original tournaments?

Not if they'd win a tournament.  An idiot savant would be more likely to
win a tournament than a [plain] idiot.  I'm assuming that by "idiot savant"
you mean an individual who does very well at some subset of a problem,
but otherwise poorly.

> How?  If both parents were idiots, how would they get past the
> tournaments?

It's unlikely that idiots would get past the tournament, which is why we
use them (;-).  However, idiot savants *can* get by them.  Pair two of
these, and you might get a genius (gets the "savant" parts from both
parents) or an idiot (gets the "idiot" parts), or anywhere in between.  Ta
da, diversity.
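
The genius-or-idiot spread can be illustrated with a deliberately crude model. The sketch below assumes (this is not in the message) that competence on each test case is inherited intact from one parent or the other; real GP subtree crossover gives no such guarantee. The score vectors and the name possible_children are hypothetical.

```python
from itertools import product

# Hypothetical per-case scores (0-10) for two "idiot savants" with
# complementary strengths: each is excellent on two cases, poor on the rest.
parent_a = [10, 10, 1, 1]
parent_b = [1, 1, 10, 10]


def possible_children(a, b):
    """Enumerate every child that takes each case's score from a or from b.

    Assumption for illustration only: competence on a case is inherited
    whole from exactly one parent.
    """
    for picks in product((0, 1), repeat=len(a)):
        yield [b[i] if pick else a[i] for i, pick in enumerate(picks)]


totals = sorted(sum(child) for child in possible_children(parent_a, parent_b))

print(totals[0])   # 4: the child that got the "idiot" part on every case
print(totals[-1])  # 40: the child that got the "savant" part on every case
```

Everything between those two extremes also occurs among the enumerated children, which is the diversity being claimed.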

>   "We feel confident that there is a 4-line LISP hack for consciousness."

Hey, I'd learn LISP all over again, if I can get a look at this hack!

-+- Sid

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 28 12:54:18 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA23936
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 28 Feb 1994 12:37:14 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA00169 for <Genetic-Programming@list.stanford.edu>; Mon, 28 Feb 1994 09:31:57 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from sun2.nsfnet-relay.ac.uk by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA06132; Mon, 28 Feb 1994 09:30:51 -0800
Via: uk.ac.sunderland.consgate; Mon, 28 Feb 1994 16:34:53 +0000
Via: isis.sunderland.ac.uk (isis.sund.ac.uk); Mon, 28 Feb 1994 16:32:39 +0000
Received: by isis.sunderland.ac.uk (4.1/SMI-4.1) id AA02741;
          Mon, 28 Feb 94 16:33:47 GMT
From: cs0ral@isis.sunderland.ac.uk (r.aler)
Message-Id: <9402281633.AA02741@isis.sunderland.ac.uk>
Subject: no subject (file transmission)
To: genetic-programming@cs.stanford.edu (genetic)
Date: Mon, 28 Feb 1994 16:33:47 +0000 (GMT)
X-Mailer: ELM [version 2.4 PL22]
Content-Type: text
Content-Length: 6517
Status: RO

> >The diversity of the population is controlled by the value of sigma.   
> >If sigma is 1, then you have plain vanilla GA/GP.  If sigma is equal  
> >to the size of the population, you get a whole pile of  
> >subpopulations, each of whose size is proportional to fitness at that  
> >hill.  Thus if you have two equally sized hills, you'd expect two  
> >subpopulations of equal size.  For values of sigma in between, you  
> >get a progressively smaller number of subpopulations.
> 
> This is a cool idea and is similar in some ways to the sexiness
> concept. However, it seems that you would end up with a lot of
> individuals who are only good for one test point. You would end
> up perhaps overfitting the problem.

	What about this? For every individual, select those test cases
where the individual is good. For instance, if fm is the individual's worst
fitness and fM its best (I am talking about fitnesses on single test
cases), then select those test cases where
 fM-f(individual,test case)<(fM-fm)/2. Then perform reproduction and
crossover taking into account the fitness on those selected test cases
only. Do this for each individual, except for those individuals
whose test case group has already been considered. By doing this you
are dividing the test case set into problem subsets (subproblems) and evaluating
all the individuals within those subsets. This way you make sure
you don't lose individuals which are good at certain parts of the
problem but not good in general. Note that the problem subsets are formed
dynamically (the only constant the user selects is 1/2). Also, individuals
will tend to breed with other individuals which are good at the same
problem subset, which solves the problem of mixing very different individuals.
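
A minimal sketch of the selection rule described above, assuming that higher fitness is better and that fM and fm are each individual's own best and worst per-case fitness. The function names and the example scores are hypothetical.

```python
def good_cases(scores):
    """Return the indices of the test cases on which an individual is 'good'.

    scores[i] is the individual's fitness on case i (higher is better).
    A case is selected when fM - f < (fM - fm) / 2. If all scores are
    equal the threshold is 0 and no case is selected.
    """
    f_best, f_worst = max(scores), min(scores)
    threshold = (f_best - f_worst) / 2
    return frozenset(i for i, f in enumerate(scores) if f_best - f < threshold)


def problem_subsets(population):
    """Collect the distinct case groups, each one considered only once."""
    subsets = []
    for scores in population:
        cases = good_cases(scores)
        if cases and cases not in subsets:
            subsets.append(cases)
    return subsets


# Hypothetical population: per-case scores on four test cases.
population = [
    [9, 8, 1, 0],   # good on cases 0 and 1
    [0, 2, 9, 10],  # good on cases 2 and 3
    [9, 9, 0, 1],   # same group as the first individual
]

for cases in problem_subsets(population):
    print(sorted(cases))
```

Selection and crossover would then be run once per subset, scoring every individual on the cases in that subset only.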

	Now you have the problem of which individuals must be rejected.
I think it should be those individuals which are not necessary (i.e., if we
reject them, none of the problem subsets will lose a top-fit
individual). To do this, we rank the individuals within each problem
subset and assign 1 point to the individual at the top, 2 points
to the second, and so on. The total number of points for each individual
will be the number of points obtained in subset1 plus the points in
subset2 and so on. This way, the individuals which are at the bottom
in many subsets (i.e., they have got many points) will be rejected (no matter
how fit they are), thereby maintaining diversity.
	I am not saying it is good to reject very fit individuals. If you want
to maintain a lot of very fit individuals in a problem subset, then you
have to increase your total population.
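
The rank-sum rejection rule can be sketched as follows. The population, the two problem subsets, and the function names are hypothetical, and ties within a subset are broken by list order, which the message does not specify.

```python
def rank_points(population, subsets):
    """Sum each individual's rank (1 = best) over all problem subsets."""
    points = [0] * len(population)
    for cases in subsets:
        # Order individuals by their summed fitness on this subset, best first.
        order = sorted(
            range(len(population)),
            key=lambda k: -sum(population[k][i] for i in cases),
        )
        for rank, k in enumerate(order, start=1):
            points[k] += rank
    return points


def reject(population, subsets, n_reject):
    """Return the indices of the n_reject individuals with the most points."""
    points = rank_points(population, subsets)
    worst_first = sorted(range(len(population)), key=lambda k: -points[k])
    return sorted(worst_first[:n_reject])


# Hypothetical per-case scores on four test cases.
population = [
    [10, 10, 1, 1],  # 0: top of subset {0, 1}
    [0, 0, 10, 10],  # 1: top of subset {2, 3}
    [9, 9, 0, 0],    # 2: second-best in {0, 1}, last in {2, 3}
    [6, 6, 6, 6],    # 3: middling everywhere
]
subsets = [{0, 1}, {2, 3}]

print(rank_points(population, subsets))  # [4, 5, 6, 5]
print(reject(population, subsets, 1))    # [2]
```

Individual 2 is rejected because it duplicates a niche that individual 0 already tops, so removing it costs no subset its best member.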

> 
> In addition, this concept smells more like a classifier system to
> me. Your solution is not an individual but a collection of
> individuals. In GP, it's nice to look at the solution program
> in one individual.

	With the difference that classifiers use non-genetic
self-organisation (i.e., the bucket brigade algorithm). With the idea
I proposed above, I expect that situations like the following will happen:

	Ind1 is good in problem subset P1 P2 P3 P4 (where Pi are test cases)
	Ind2 is good in problem subset P3 P4 P5 P6

	Therefore Ind1 and Ind2 will sometimes breed together, and perhaps
we will get an individual which is good in P1 P2 P3 P4 P5 P6 (this is the
basic idea of GP anyway). Problem subsets should tend to get bigger and
bigger and eventually cover the whole problem. On the other hand, it
looks as if there is no real pressure for an individual to increase its
problem subset size, and since it is easier to solve one problem than to
solve two, what we would get would be a bunch of overspecialized
individuals. But in this scheme, the only way of not being rejected is
to find new niches (i.e., problem subsets). For instance, if we have:

	Ind1, fitness(P1)=10, fitness(Pi, i<>1)=0
	Ind2, fitness(P2)=10, fitness(Pi, i<>2)=0
	Ind3, fitness(P1)=6, fitness(P2)=6

	In this case, Ind3 is better than Ind1 and Ind2 on the subset {P1, P2},
although it is not very good at either of them, so it would still have a chance
of surviving because it is the top individual of its own subset.
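
The three-individual example can be worked through under the (fM-fm)/2 rule from earlier in this message. The sketch assumes four test cases P1..P4 and that Ind3 scores 0 on the cases the example does not mention; both are assumptions, as are the function names.

```python
def good_cases(scores):
    """Cases where fM - f < (fM - fm) / 2, using the individual's own fM, fm."""
    f_best, f_worst = max(scores), min(scores)
    return frozenset(
        i for i, f in enumerate(scores) if f_best - f < (f_best - f_worst) / 2
    )


# Index 0 is P1, index 1 is P2, and so on (four cases assumed).
individuals = {
    "Ind1": [10, 0, 0, 0],
    "Ind2": [0, 10, 0, 0],
    "Ind3": [6, 6, 0, 0],  # 0 on P3 and P4 is an assumption
}


def top_of(cases):
    """Name of the individual with the highest summed fitness on cases."""
    return max(individuals, key=lambda n: sum(individuals[n][i] for i in cases))


for name, scores in individuals.items():
    own_subset = good_cases(scores)
    print(name, sorted(own_subset), top_of(own_subset))
```

Ind3's own subset comes out as {P1, P2}, where its summed fitness of 12 beats the 10 scored by Ind1 and by Ind2, so each of the three is the top individual of its own subset.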

> 
> Nick, how do they combine the individuals together so that they
> know how the "group-solution" is doing ??

	In the scheme above, I expect a combined solution to appear in a single
individual. But sometimes a problem is made of very different subproblems, and
the best way to combine them is just to put them together and activate one
of them depending on the problem. For instance, in nature it would be quite useful to
be able to feed on earth (like plants), shit ( pardon : ) ) (like fungi), meat, grass
(like rabbits), etc., but you don't find any being on Earth able to do all these things
at the same time (except some children, of course). Instead, you find individuals adapted
to some kind of food, or you find raw combinations of these individuals (like the combination
of a couple of aerobic and anaerobic bacteria to form a eukaryotic cell, or the combination
human being-rabbit, where the rabbit eats the grass and the human eats the rabbit : ) ).

I think it would be a good idea to evolve a subproblem classifier (given a specific problem,
it tells you which problem subset it belongs to). Note that with my scheme the individuals themselves
are dividing the problem space into problem subsets. Therefore, given a specific problem, you
first have to determine which problem subset it belongs to and then
activate the top individual for that problem subset. How to evolve a problem classifier?
Perhaps, after the problem space has been clearly divided into problem subsets, another kind of
evolution should take place: the individuals from the earlier evolution should be used as
subroutines (as in an ADF scheme), and new functions to activate those subroutines should be included
in the soup. The same sensor functions used by the original individuals could be included in the
new soup to extract characteristics from the problem and thereby classify it. If we use an ADF
scheme, the old individuals (now subroutines) would be allowed to evolve further and further too!

	This has been a really long message. Perhaps I was just thinking aloud ... I hope you
understand it; I know my English is not very good. Suggestions, ideas, money and spare time
on supercomputers and/or parallel machines are welcome : )

			Ricardo Aler
			Room D3A, School of Computing, Priestman Building
			University of Sunderland
			Sunderland (UK)

			e-mail: cs0ral@isis.sunderland.ac.uk

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 28 12:09:48 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA22641
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 28 Feb 1994 11:57:27 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id IAA20959 for <Genetic-Programming@list.stanford.edu>; Mon, 28 Feb 1994 08:19:49 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from cs.brandeis.edu (berry.cs.brandeis.edu) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA02473; Mon, 28 Feb 1994 08:18:45 -0800
Received: from hack.cs.brandeis.edu by cs.brandeis.edu Mon, 28 Feb 1994 11:19:00 -0500
Received:  by hack.cs.brandeis.edu (1.37.109.8/UofC3.0)
	id AA19098; Mon, 28 Feb 1994 11:18:59 -0500
Date: Mon, 28 Feb 1994 11:18:59 -0500
From: Patrick Tufts <zippy@cs.brandeis.edu>
Message-Id: <9402281618.AA19098@hack.cs.brandeis.edu>
To: aries@media.mit.edu
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <9402220413.AA14060@media.mit.edu> (aries@media.mit.edu)
Subject: Re: *LISP
Status: RO

   Date: Mon, 21 Feb 94 23:13:05 -0500
   From: "Michael P. Johnson" <aries@media.mit.edu>


   Hello.  Perhaps someone can help me.  I am looking for a GP library for
   *LISP, on the Connection Machine, or a reason why I would be able to
   write one.  

I've done GP on a CM-5 in *Lisp.  It works in data parallel mode
(SIMD), which is fine if you want to evaluate individuals using
fitness cases.  I've worked with populations of 10000 individuals with
~10000 fitness cases and gotten reasonable performance.

[....]

   Can the CM do this?  I know very little about them.  My guess is that
   the CM5 (MIMD) could but the CM2 (SIMD) probably couldn't.  Is this
   intuition right?  (That would be too bad since I could get megatime on a
   CM2, not as much on a CM5).

Both the CM-2 and the CM-5 are fine as long as you want to evaluate
test cases in parallel.  If you want to evaluate *organisms* in
parallel, then the CM-2 won't buy you anything.  Neither will the CM-5
with *Lisp, since *Lisp doesn't support MIMD mode.  You'd probably
want to migrate to C* or FORTRAN.

   If anyone has code, experience, or suggestions on this, please let me
   know.  I am eager to start fiddling again and I think my serial version
   is optimized as much as possible.  Besides, GP is MADE to be parallel!
   It seems blasphemous to run it on a serial machine (unless, of course,
   you have 100 such machines available...)

   Thanks,

   -Mike
   aries@media.mit.edu

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 28 07:45:24 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA15291
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 28 Feb 1994 07:33:19 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id EAA20142 for <Genetic-Programming@list.stanford.edu>; Mon, 28 Feb 1994 04:50:52 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from odin.icd.ab.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA06298; Mon, 28 Feb 1994 04:49:43 -0800
Received: from gadwal.icd.ab.com (gadwal.icd.ab.com [130.151.132.71]) by odin.icd.ab.com (8.1C/5.6) with SMTP id HAA03264; Mon, 28 Feb 1994 07:49:37 -0500
Date: Mon, 28 Feb 1994 07:49:37 -0500
From: "Mike J. Keith" <keithm@icd.ab.com>
Message-Id: <199402281249.HAA03264@odin.icd.ab.com>
To: mcphee@nxsci245.mrs.umn.edu
Subject: Re: Diversity and sexiness in GA/GP
Cc: genetic-programming@cs.stanford.edu
Status: RO

>The diversity of the population is controlled by the value of sigma.   
>If sigma is 1, then you have plain vanilla GA/GP.  If sigma is equal  
>to the size of the population, you get a whole pile of  
>subpopulations, each of whose size is proportional to fitness at that  
>hill.  Thus if you have two equally sized hills, you'd expect two  
>subpopulations of equal size.  For values of sigma in between, you  
>get a progressively smaller number of subpopulations.

This is a cool idea and is similar in some ways to the sexiness
concept. However, it seems that you would end up with a lot of
individuals who are only good for one test point. You would end
up perhaps overfitting the problem.

In addition, this concept smells more like a classifier system to
me. Your solution is not an individual but a collection of
individuals. In GP, it's nice to look at the solution program
in one individual.

Nick, how do they combine the individuals together so that they
know how the "group-solution" is doing ??

Mike

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 28 05:14:45 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA13245
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 28 Feb 1994 05:02:55 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id CAA19569 for <Genetic-Programming@list.stanford.edu>; Mon, 28 Feb 1994 02:28:31 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from netcom8.netcom.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA01030; Mon, 28 Feb 1994 02:27:28 -0800
Received: from localhost by netcom8.netcom.com (8.6.4/SMI-4.1/Netcom)
	id CAA02874; Mon, 28 Feb 1994 02:28:20 -0800
Date: Mon, 28 Feb 1994 02:28:20 -0800
From: order@netcom.com (Walter Alden Tackett)
Message-Id: <199402281028.CAA02874@netcom8.netcom.com>
To: phred@leland.Stanford.EDU
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <199402271203.EAA07414@elaine8.Stanford.EDU> (message from David Andre on Sun, 27 Feb 1994 04:03:37 -0800 (PST))
Subject: Re: printing in Unix...
Status: RO

"phred" (aka dave) writes:
> 
> Walter, 
> 
> I noticed that in sgpc the output goes to the redirected files 
> very quickly -- Basically I can scan the files as the run 
> progresses to see how things are doing, when using sgpc.
> 
> Now, in my new core, I write output to files, but it doesn't actually
> show up in the files until the end of the run...Is there some way that
> you caused sgpc to write out to files more often?
> 
> Thanks for any light you could shed.  
> 
> David

there's some command like "setbuf" that controls how often files get
written out. If you set the buffer to \0 (a null pointer), there is no
buffering done - if you dig thru SGPC i think you will find it. Most
systems, e.g., even Borland, have some kind of mechanism for doing
this.  I do it as a matter of course -- we used to have an Alliant that
by default would buffer about 2k, so when your job crashed you got no
clue what happened - probably why they are out of business now.  Also,
don't trust my word that the command is called setbuf - hang on - in
BC4 it's called setbuf().  In unix (at least in BSD) it's called
setbuffer - the params are slightly different.  If your jobs are
anything like mine, then they are compute-bound anyhow, so buffering
doesn't buy you much.  certainly doesn't buy *me* much.  hope this
helps.  Funny, i get a lot of mail from people who say i could improve
the speed of SGPC by improving the IO.  What the heck kind of toy
problems are they working on?  Maybe they are dumping the whole
population?  (sys op must LOVE them if they are doing that!!!!!)
well, that's enuf outta me.  hope it helps!
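
The advice above concerns C's setbuf(). As an analogous sketch in a
different language (Python, swapped in here for illustration; this is
not SGPC's code), the same effect of seeing output while a run is still
going can be had by opening the log line-buffered. The names open_log
and run.log are hypothetical.

```python
import os
import tempfile

def open_log(path):
    """Open a text log so each completed line reaches the file right away."""
    # buffering=1 requests line buffering (valid in text mode only).
    return open(path, "w", buffering=1)

path = os.path.join(tempfile.mkdtemp(), "run.log")
log = open_log(path)
log.write("generation 0: best fitness 0.25\n")

# The log is still open, as it would be mid-run, yet a second reader
# can already see the line that was just written.
with open(path) as reader:
    seen_mid_run = reader.read()

log.close()
```

Calling log.flush() after each write is the other common way to get
the same behavior with a normally buffered file.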

-wt

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 28 03:00:57 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA11957
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 28 Feb 1994 02:54:17 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id AAA19039 for <Genetic-Programming@list.stanford.edu>; Mon, 28 Feb 1994 00:05:56 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from netcomsv.netcom.com (uucp2-b.netcom.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA05480; Mon, 28 Feb 1994 00:04:48 -0800
Received: from red.com by netcomsv.netcom.com with UUCP (8.6.4/SMI-4.1)
	id XAA11188; Sun, 27 Feb 1994 23:31:08 -0800
Received: by red.com (920330.SGI/921111.SGI.AUTO.ANONFTP)
	for @netcomsv:forrest@cs.unm.edu id AA01092; Sun, 27 Feb 94 23:28:05 -0800
From: cwr@red.com (Craig W. Reynolds)
Message-Id: <9402272328.ZM1090@red.com>
Date: Sun, 27 Feb 1994 23:28:02 -0800
In-Reply-To: <9402280048.AA01613@nxsci245.mrs.umn.edu>
X-Mailer: Z-Mail (2.1.4 02apr93)
To: mcphee@nxsci245.mrs.umn.edu (Nic McPhee)
Subject: Re: Diversity and sexiness in GA/GP
Cc: genetic-programming@cs.stanford.edu, forrest@cs.unm.edu, cwr@red.com
Status: RO

    Date: Sun, 27 Feb 94 18:48:02 CST
    From: mcphee@nxsci245.mrs.umn.edu (Nic McPhee)

    Since no one appears to have mentioned this yet, I thought I'd
    point out that there's an interesting article on the subject of
    diversity in the 2nd issue of _Evolutionary Computation_:

    	"Searching for diverse, cooperative populations with
    	 genetic algorithms", by Smith, Forrest, and Perelson

    This is then (sort of) followed up in the next issue.  Note that
    while they're working with GA's, the idea (summarized below) could
    be easily applied to GP as well...

Thanks Nic for pointing this out.  Indeed the "follow up" paper was:

 S. Forrest, B. Javornik, R. Smith, and A. Perelson (1993) Using
 Genetic Algorithms to Explore Pattern Recognition in the Immune
 System, _Evolutionary Computation_, 1(3), 191-211.

Note that the goal there was not to maintain diversity while searching
for a single solution to a problem, but to breed a cooperative
POPULATION which worked as a team to solve a family of problems.  That
is, a collection of antibodies to fend off a collection of antigens.

(Side note about the dynamics of our virtual community: When I made
the connection and realized that Nic was talking about the immune
system stuff, I had a vague memory of this topic coming up on the GP
list before.  I poked around and found what I was remembering.  By an
amazing coincidence, it was almost exactly one year ago, on March 2,
1993 that this came up.  In a discussion about diversity (with the
subject "DEMES") I mentioned that I'd recently heard Prof. Forrest
talk about a GA model of the immune system but I didn't recall the
details.  There were some replies supplying the details from Ron
Goldthwaite and Lee Altenberg, two members of the GP community who
were unknown to me at the time, but who I've since come to know and
respect.)

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Feb 27 20:00:29 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA15708
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 27 Feb 1994 19:46:30 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id QAA17339 for <Genetic-Programming@list.stanford.edu>; Sun, 27 Feb 1994 16:59:44 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from nxsci245.mrs.umn.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA22140; Sun, 27 Feb 1994 16:58:40 -0800
Received: by nxsci245.mrs.umn.edu (NeXT-1.0 (From Sendmail 5.52)/NeXT-2.0)
	id AA01613; Sun, 27 Feb 94 18:48:02 CST
Date: Sun, 27 Feb 94 18:48:02 CST
From: mcphee@nxsci245.mrs.umn.edu (Nic McPhee)
Message-Id: <9402280048.AA01613@nxsci245.mrs.umn.edu>
Received: by NeXT Mailer (1.63)
To: genetic-programming@cs.stanford.edu
Subject: Re: Diversity and sexiness in GA/GP
Status: RO

Since no one appears to have mentioned this yet, I thought I'd point
out that there's an interesting article on the subject of diversity
in the 2nd issue of _Evolutionary Computation_:

	"Searching for diverse, cooperative populations with
	 genetic algorithms", by Smith, Forrest, and Perelson

This is then (sort of) followed up in the next issue.  Note that  
while they're working with GA's, the idea (summarized below) could be  
easily applied to GP as well.

The idea in a nutshell:

    When it comes time to compute fitnesses, repeatedly

	0.  Select a certain number (say sigma) of individuals
	1.  Select a fitness case
	2.  Compute the fitness of each of the sigma individuals on
	    this case
	3.  Reward the best of the lot by adding its score to its
	    current fitness

The diversity of the population is controlled by the value of sigma.   
If sigma is 1, then you have plain vanilla GA/GP.  If sigma is equal  
to the size of the population, you get a whole pile of  
subpopulations, each of whose size is proportional to fitness at that  
hill.  Thus if you have two equally sized hills, you'd expect two  
subpopulations of equal size.  For values of sigma in between, you  
get a progressively smaller number of subpopulations.

I played with it a bit in a GA context, and found it worked quite
nicely without being overly expensive (some of the Hamming difference
approaches are O(N^2)).
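
A minimal Python sketch of steps 0-3 above, under stated assumptions:
the names shared_fitness, score, and rounds are hypothetical, and the
post does not say how individuals or fitness cases are sampled, so
uniform random sampling is assumed here.

```python
import random

def shared_fitness(population, cases, score, sigma, rounds, rng=random):
    """Accumulate fitness through repeated best-of-sigma contests.

    score(individual, case) returns a number, higher meaning better
    (an assumption; the post does not fix the scoring function).
    """
    fitness = [0.0] * len(population)
    for _ in range(rounds):
        # 0. Select sigma distinct individuals (tracked by index).
        sample = rng.sample(range(len(population)), sigma)
        # 1. Select one fitness case.
        case = rng.choice(cases)
        # 2. Score each sampled individual on that case.
        scores = {i: score(population[i], case) for i in sample}
        # 3. Reward only the best of the lot.
        best = max(sample, key=lambda i: scores[i])
        fitness[best] += scores[best]
    return fitness
```

With sigma=1 each sampled individual is simply credited with its own
score on a random case, which corresponds to the plain vanilla setting
described above; with larger sigma, individuals have to win a case
against the others in the sample to earn anything from it.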

	Nic McPhee
	mcphee@cda.mrs.umn.edu
	University of Minnesota, Morris

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Feb 27 16:44:42 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA12127
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 27 Feb 1994 16:38:44 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id NAA16561 for <Genetic-Programming@list.stanford.edu>; Sun, 27 Feb 1994 13:52:58 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from cs.columbia.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA19498; Sun, 27 Feb 1994 13:51:54 -0800
Received: from age.cs.columbia.edu (age.cs.columbia.edu [128.59.1.6]) by cs.columbia.edu (8.6.4/8.6.4) with ESMTP id QAA00352 for <genetic-programming@cs.stanford.edu>; Sun, 27 Feb 1994 16:50:57 -0500
Received: from localhost (evs@localhost) by age.cs.columbia.edu (8.6.4/8.6.4) id QAA06241; Sun, 27 Feb 1994 16:49:44 -0500
Date: Sun, 27 Feb 94 16:49:43 EST
From: Eric Siegel <evs@cs.columbia.edu>
To: cs0ral@isis.sunderland.ac.uk (r.aler)
Cc: genetic-programming@cs.stanford.edu, cs0ral@orac.sunderland.ac.uk
Subject: RE: Network parallelism
In-Reply-To: Your message of Tue, 22 Feb 1994 14:33:39 +0000 (GMT)
Message-Id: <CMM.0.90.2.762385783.evs@age.cs.columbia.edu>
Status: RO

> 	I've built a parallel version for SGPC by using PVM. The configuration
> is made of a master program (where all genetic operations take place) and
> one server for every machine (where fitnesses are calculated). Individuals are
> sent from the master to the servers and servers answer with fitness.  

How long does your (current) fitness measure take, and how long does each
master-server transaction take?  Inquiring minds want to know.

Thank you,
Eric

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Feb 27 16:29:44 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA11891
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 27 Feb 1994 16:28:09 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id NAA16458 for <Genetic-Programming@list.stanford.edu>; Sun, 27 Feb 1994 13:25:28 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from anl.gov (dns2.anl.gov) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA19112; Sun, 27 Feb 1994 13:24:24 -0800
Received: from camelot.es.anl.gov by anl.gov (4.1/SMI-4.1)
	id AA24650; Sun, 27 Feb 94 15:24:23 CST
Received: by camelot.es.anl.gov (4.1/SMI-4.0)
	id AA11885; Sun, 27 Feb 94 15:30:36 CST
Date: Sun, 27 Feb 94 15:30:36 CST
From: burke@camelot.es.anl.gov (Jay Burke)
Message-Id: <9402272130.AA11885@camelot.es.anl.gov>
To: genetic-programming@cs.stanford.edu
Subject: Looking for pointers on PC-Beagle
Status: RO

Hi,

  I am new to this list and am looking forward to using genetic
programming in some of my work.  I have heard a little bit
about the software PC-Beagle and I am interested in purchasing
it from the author Richard Forsyth.

  If I could get any help in getting in contact with the author
about purchasing this, it would be greatly appreciated.  Please
contact me at the below e-mail address.

  Thanks in advance.

Jay Burke
jay@anl.gov

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Feb 27 14:29:34 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA09500
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 27 Feb 1994 14:22:35 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA16061 for <Genetic-Programming@list.stanford.edu>; Sun, 27 Feb 1994 11:48:50 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from balder.cs.wisc.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA17540; Sun, 27 Feb 1994 11:47:46 -0800
From: derek@cs.wisc.edu (Derek Zahn)
Message-Id: <9402271947.AA19545@balder.cs.wisc.edu>
Received: by balder.cs.wisc.edu; Sun, 27 Feb 94 13:47:45 -0600
Subject: tree distances
To: genetic-programming@cs.stanford.edu
Date: Sun, 27 Feb 1994 13:47:44 -0600 (CST)
X-Mailer: ELM [version 2.4 PL21]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 329       
Status: RO


Vasant Honavar:

> several tree distance measures based on purely structural
> or syntactic criteria have been around for a long time in syntactic
> pattern recognition.

See, e.g., Fu and Lu, A Clustering Procedure for Syntactic Patterns.
IEEE Transactions on Systems, Man, and Cybernetics, v10 (Oct 1977),
pp 734-742.

derek

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Feb 27 12:59:52 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA07753
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 27 Feb 1994 12:47:49 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id KAA15643 for <Genetic-Programming@list.stanford.edu>; Sun, 27 Feb 1994 10:08:54 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA16132; Sun, 27 Feb 1994 10:07:50 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA12729; Sun, 27 Feb 94 10:07:48 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA00653; Sun, 27 Feb 94 10:07:48 PST
Date: Sun, 27 Feb 94 10:07:48 PST
Message-Id: <9402271807.AA00653@hume.CS.ORST.EDU>
To: smaxwell@wpo.borland.com
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: smaxwell@wpo.borland.com's message of Fri, 25 Feb 1994 16:38:52 -0800 <sd6e2a6c.075@wpo.borland.com>
Subject: Diversity and sexiness in GA/GP
Status: RO

 > Date: Fri, 25 Feb 1994 16:38:52 -0800
 > From: smaxwell@wpo.borland.com
 > 
 > What if we were to combine the sexiness idea with the order-2
 > tournaments idea (sorry, forgot the reference):
 > 
 > For each pairing, conduct n sets of m [normal] fitness tournaments,
 > resulting in n winners.  Compute the distance* between each pair (n
 > taken 2 at a time).  The pair with the largest distance get to fool around.

That sounds very similar to Mike Keith's sexiness formula.  I don't
suppose there's any way you could find that reference, is there?

 > Every winner is, by definition, of high fitness.  Large distance ensures
 > that each of the pair has something the other could use (opposites
 > attract), and encourages (or at least capitalizes) diversity.

Won't idiot savants (just the sort of individual with which an
absent-minded professor would want to breed, genetically speaking) be
culled out by the original tournaments?

 > The results of such pairings are likely to be more diverse, too, I think. 
 > We're as likely to get a genius with high scores from both parents as an
 > idiot with low scores from both.

How?  If both parents were idiots, how would they get past the
tournaments?

 > * Distance here is the sum of squares difference per fitness case.
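
A minimal Python sketch of the quoted scheme, to make the question
about idiot savants concrete. The names pick_mates and
profile_distance are hypothetical, and two details are assumptions the
quoted text does not settle: tournament entrants are drawn uniformly
at random, and the tournament winner is the entrant with the highest
total score.

```python
import random
from itertools import combinations

def profile_distance(a, b):
    """Sum of squared per-case score differences between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pick_mates(profiles, n, m, rng=random):
    """Run n tournaments of size m, then pair the two most distant winners.

    profiles[i] holds the per-case scores of individual i (higher is
    better). Returns a pair of indices into profiles. Requires n >= 2.
    """
    winners = []
    for _ in range(n):
        entrants = rng.sample(range(len(profiles)), m)
        # Ordinary fitness tournament: highest total score wins.
        winners.append(max(entrants, key=lambda i: sum(profiles[i])))
    # Among all pairs of winners, choose the pair that differs most.
    return max(combinations(winners, 2),
               key=lambda p: profile_distance(profiles[p[0]], profiles[p[1]]))
```

In this sketch an individual with a low total score can only reach the
pairing stage by drawing even weaker opponents, so the distance
criterion only ever chooses among tournament winners. If the same
individual wins every tournament, the returned pair is that individual
twice.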

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\   "We feel confident that there is a 4-line LISP hack for consciousness."   /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Feb 27 06:43:59 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA00963
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 27 Feb 1994 06:35:04 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id CAA13977 for <Genetic-Programming@list.stanford.edu>; Sun, 27 Feb 1994 02:15:48 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from netcom8.netcom.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA05973; Sun, 27 Feb 1994 02:14:44 -0800
Received: from localhost by netcom8.netcom.com (8.6.4/SMI-4.1/Netcom)
	id CAA25684; Sun, 27 Feb 1994 02:15:29 -0800
Date: Sun, 27 Feb 1994 02:15:29 -0800
From: order@netcom.com (Walter Alden Tackett)
Message-Id: <199402271015.CAA25684@netcom8.netcom.com>
To: kinnear@adapt.com
Cc: conor@ravenloft.ucc.ie, p00396@psilink.com,
        genetic-programming@cs.stanford.edu
In-Reply-To: <9402231620.AA21884@adapt.com> (kinnear@adapt.com)
Subject: Re: Network parallelism.
Status: RO

I dunno about "astute", but i think that diversity is a *real*
key issue.  my limited experience sez that the less mixing the better.
just MHO.
-wt

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 25 15:08:38 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA16839
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 25 Feb 1994 14:18:24 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA04370 for <Genetic-Programming@list.stanford.edu>; Fri, 25 Feb 1994 11:34:39 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from bay.cc.kcl.ac.uk by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA21046; Fri, 25 Feb 1994 11:33:30 -0800
Received: by bay.cc.kcl.ac.uk (MX V3.3 VAX) id 22550; Fri, 25 Feb 1994 19:35:53
          EST
Date: Fri, 25 Feb 1994 19:35:41 EST
From: udue074@bay.cc.kcl.ac.uk
To: genetic-programming@cs.stanford.edu
Message-Id: <0097A99D.F7B1E6E0.22550@bay.cc.kcl.ac.uk>
Subject: Re: Diversity and sexiness in GA/GP
Status: RO


>I have an idea for a somewhat related scheme that would, I think, work
>for GP as well.  It requires that the fitness test involve a number of
>"test cases", where fitness is the sum of an individual's success over
>these cases.

>Measure fitness for each individual for each test case.  More fit
>individuals are more likely to breed, as usual, but you only select
>one individual to breed, and that individual chooses its mate.  The
>mate is chosen, either deterministically or stochastically, according
>to "sexiness", where:
>             __
>             \
>sexiness  =   >  success ( potential-mate ) - success (self)
>             /_
>	      n
>
>and n is the number of cases in the fitness test.
>
>
>In other words, an individual is sexy to you to the extent that it
>does better than you on the test cases.  The formula might be changed
>to sum-of-squares to increase the relative sexiness of individuals
>that vastly outperform you in some areas.

I would like to specify the idea of sexiness in the following way:
"an individual is sexy to you to the extent that it does better than
you" on those cases on which you are especially poor.
In this way a complementarity of phenotype is achieved, with
the hope of propagating it to the children at the genotypic level.

This method is used successfully on the Finite State Automaton
identification problem by Vanyo Slavov (vslavov@inf.nbu.bg).
As far as I know he uses the principle of sexiness not only
to choose mates, but also to form a multi-cellular creature with
differentiated cells. (The cells are complementary in the sense
that each cell solves a part of the problem, but the union
of the solutions of the different cells gives an approximate solution
to the whole problem.) A possible fitness function can give credit
if the union covers a larger part of the whole solution, and
possibly penalize intersections.
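
A minimal Python sketch contrasting the quoted formula with the
complementary variant described above. All function names are
hypothetical, and so is the weak_below threshold: the text does not
define what counts as "especially poor", so a fixed cutoff on the
per-case score is assumed here.

```python
def sexiness(self_profile, mate_profile):
    """Quoted formula: sum over cases of success(mate) - success(self)."""
    return sum(m - s for s, m in zip(self_profile, mate_profile))

def complementary_sexiness(self_profile, mate_profile, weak_below=0.5):
    """Count the mate's advantage only on cases where self is weak.

    weak_below is an assumed cutoff, not something the post specifies.
    """
    return sum(max(m - s, 0.0)
               for s, m in zip(self_profile, mate_profile)
               if s < weak_below)

def choose_mate(self_index, profiles, attraction=complementary_sexiness):
    """Deterministic choice: the most attractive other individual."""
    others = [i for i in range(len(profiles)) if i != self_index]
    return max(others,
               key=lambda i: attraction(profiles[self_index], profiles[i]))
```

For profiles [1,1,0,0], [1,1,1,0] and [0,0,1,1], the first individual
prefers the second under the quoted formula (which, summed without
squaring, is just the difference in total fitness) but prefers the
third under the complementary variant, because the third covers both
cases on which the first scores zero.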
                                  
George Bilchev

/ -----------------------------------------------------------------\
|  George@inf.nbu.bg             | New Bulgarian University        |
|                                | Dept. of Computer Science       |
|  G.Biltchev@bay.cc.kcl.ac.uk   | Sofia 1125                      |
|    (valid until 25 March)      | Bulgaria                        |
\------------------------------------------------------------------/

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 25 15:26:57 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA19715
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 25 Feb 1994 15:26:54 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id MAA04747 for <Genetic-Programming@list.stanford.edu>; Fri, 25 Feb 1994 12:37:05 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA24071; Fri, 25 Feb 1994 12:36:01 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA16464; Fri, 25 Feb 94 12:35:49 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA19841; Fri, 25 Feb 94 12:35:47 PST
Date: Fri, 25 Feb 94 12:35:47 PST
Message-Id: <9402252035.AA19841@hume.CS.ORST.EDU>
To: dfaulkne@lightstream.com, hthies@willamette.edu,
        tadepall@chert.CS.ORST.EDU
Cc: genetic-programming@cs.stanford.edu, dfaulkne@lightstream.com
In-Reply-To: Dave Faulkner's message of Fri, 25 Feb 1994 13:56:57 -0500 <9402251856.AA09474@cockatrice.LightStream.COM>
Subject: Diversity and sexiness in GA/GP 
Status: RO

 > Date: Fri, 25 Feb 1994 13:56:57 -0500
 > From: Dave Faulkner <dfaulkne@LightStream.COM>
 > 
 > Thank you for your thoughts on this matter.  These are interesting metrics to
 > be sure, but I don't think that they address the issue of maintaining *diversity*
 > in a population based on uniqueness of genotype.  Fitness is twice removed from
 > genotypic representation, as the genotype is mapped into phenotype (problem
 > domain) and the phenotype is then mapped to fitness (expression of an individual
 > in the problem ecology).  This distance from the original representation 
 > ...
 >
 > Think of a fitness surface with two hills.  You want equal numbers of beings at
 > each hill site if they are equal in height.  With the metrics you propose, there
 > is no difference in the metric if a being is at the same height for either hill,
 > so there is no bias toward picking the guy (gal) not on your being's hill 
 > to mate with. It's only by looking at the genotypic (and some would say that
 > phenotypic is preferred) representation that you know that the other guy(gal)
 > is not on your being's hill and so to maintain diversity you should pick
 > the guy(gal) on the other hill to mate.

I'm not so sure.  Under the sexiness scheme, you would only be "on the
same hill" as me if your success chart over all of the test cases
looked the same as mine.  I'd expect that the technique would focus on
"important" differences, where trying to maintain raw genetic
diversity might waste "effort" keeping introns diverse.

 > In GP, the problem is understanding nearness in the genotypic or phenotypic
 > representations, I would guess.  Hamming distance calculations take 
 > advantage of the consistency of a parameter string: position 3 always
 > means "the x value" in some function, has a range from 1..-1, etc.
 > GP s-expressions have no such consistency, and no obvious canonical
 > form to facilitate comparisons between s-expressions; thus no easy
 > metric.

Another point for me.  :-)

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\   "We feel confident that there is a 4-line LISP hack for consciousness."   /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 25 15:29:46 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA17362
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 25 Feb 1994 14:30:54 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA04413 for <Genetic-Programming@list.stanford.edu>; Fri, 25 Feb 1994 11:44:57 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from mail.netcom.com (netcom2.netcom.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA21448; Fri, 25 Feb 1994 11:43:53 -0800
Received: from localhost by mail.netcom.com (8.6.4/SMI-4.1/Netcom)
	id LAA16818; Fri, 25 Feb 1994 11:44:43 -0800
Date: Fri, 25 Feb 1994 11:44:43 -0800
From: peb@netcom.com (Paul E. Baclace)
Message-Id: <199402251944.LAA16818@mail.netcom.com>
To: genetic-programming@cs.stanford.edu, schenk@cs.bris.ac.uk
Subject: PC Scheme, etc
Status: RO

Since I ran a project at Autodesk to build a Scheme environment, I have
some answers to your Scheme questions...

1. Try ELK, created by Oliver Laumann.  This is a fast and robust interpreter.
  Scheme2C from HP is probably faster, but it lacks call/cc and bignums, among
  other things.

2. call/cc duplicates the stack in the heap whenever it is called.  Invoking
  the continuation replaces the current stack with the saved one.  The 
  old stack is garbage collected if it is unbound (it could be bound by 
  another call/cc, for instance).
  Call/cc thus introduces a GC problem.  In the Autodesk version of Scheme
  I added call/cc-lite which was simply a setjmp (aside from the massive 
  stack unwind interactions with the symbol tables...) so it did not 
  introduce a GC problem and was much faster.  (Sorry you can't use the
  program due to a management takeover that went back to the "core business"
  thus switching from Exploration to Exploitation.)

3. There must be another binding lurking around to keep the list around.
  Either that or you have a buggy interpreter.
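
Point 2 contrasts a full call/cc, which copies the stack, with a
setjmp-style variant. A minimal Python sketch of the cheaper kind
follows, assuming call/cc-lite was an escape-only continuation (usable
only while the call that created it is still active), which a bare
setjmp implies but the post does not spell out. The name
call_with_escape is hypothetical, and Python exceptions stand in for
setjmp/longjmp.

```python
def call_with_escape(body):
    """Call body(k); calling k(value) unwinds back here and returns value.

    Nothing is copied to the heap: k only works while this call is
    still on the stack, unlike a full call/cc continuation.
    """
    class Escape(Exception):
        # A fresh class per call, so nested escapes are not confused.
        def __init__(self, value):
            super().__init__(value)
            self.value = value

    def k(value):
        raise Escape(value)

    try:
        return body(k)
    except Escape as exc:
        return exc.value
```

For example, call_with_escape(lambda k: 1 + k(42)) returns 42 because
the addition is abandoned when k is called. Invoking k after
call_with_escape has returned just raises an uncaught exception, which
is the restriction that makes this variant cheap.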

Paul E. Baclace
peb@netcom.com

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 25 15:08:42 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA17366
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 25 Feb 1994 14:31:09 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id KAA04106 for <Genetic-Programming@list.stanford.edu>; Fri, 25 Feb 1994 10:42:56 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from dynamo.ecn.purdue.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA18819; Fri, 25 Feb 1994 10:41:51 -0800
Received: from msesmac11.ecn.purdue.edu by dynamo.ecn.purdue.edu (5.65/1.32jrs)
	id AA25716; Fri, 25 Feb 94 13:41:48 -0500
Message-Id: <9402251841.AA25716@dynamo.ecn.purdue.edu>
Date: Fri, 25 Feb 1994 13:41:49 -0600
To: genetic-programming@cs.stanford.edu
From: tenorio@ECN.PURDUE.EDU
X-Sender: tenorio@dynamo.ecn.purdue.edu
Subject: Re: Diversity and sexiness in GA/GP
Status: RO

>>In other words, an individual is sexy to you to the extent that it
>>does better than you on the test cases.  The formula might be changed
>>to sum-of-squares to increase the relative sexiness of individuals
>>that vastly outperform you in some areas.
>
>This is an interesting idea. The way I look at it is that you're looking at
>a "fitness-profile" instead of just a single value fitness.
>
>However, I think it makes more sense to me for "sexiness" to increase
>based on the opposites attract premise. That is, an individual will
>desire others whose profile is good but UNIQUE from his own.
>
>So if you have 3 individuals A, B, and C where A is looking for a
>mate. Even if B's raw fitness is better than C's, A will desire C if
>C's profile is unique enough - hey I would rather kiss a stranger than
>my sister even if my sister is more attractive:
>

Shouldn't this be better summarized by saying that the most complementary
form of the problem is the best partner? This is akin to a combination of
low correlation among parents as well as high performance in the penalty
function. The problem is that this describes a two-part function to be
minimized and that has all the classical problems of such multiobjective
minimization. If you require high fitness, the correlation term gets
neglected and you spin your wheels (focused search, low diversity). If you
require uncorrelation to be higher, diversity certainly increases and so
does the probability of a good solution at the expense of a more unfocused
search.

Sho Kuwamoto (sho@physics.purdue.edu) has experimented with a number of
different forms of the correlation x penalty terms in the design of the
SONN (previous posting) with interesting results that were superior to not
having the correlation term in the penalty function. This is an interesting
problem for submodel selection.
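
The fitness-versus-correlation trade-off described above can be
sketched in Python as follows. This is an illustration only, not the
SONN formulation: mate_score, pearson, and the single weight parameter
are hypothetical, the score is written as something to maximize rather
than minimize, and a weighted sum is just one simple way to combine
the two terms.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length score profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant profile is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)

def mate_score(self_profile, mate_profile, weight=0.5):
    """Higher is better: reward mate fitness, penalize correlation with self.

    weight=1.0 ignores correlation (focused search, low diversity);
    weight=0.0 ignores fitness (diverse but unfocused search).
    """
    fitness = sum(mate_profile) / len(mate_profile)
    correlation = pearson(self_profile, mate_profile)
    return weight * fitness - (1.0 - weight) * correlation
```

With self = [1,1,0,0], a fitter but similar candidate [1,1,1,0] scores
below a less fit but anti-correlated candidate [0,0,1,1] at
weight=0.5, and the ranking flips at weight=1.0, which is the tension
between the two objectives in miniature.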


Cheers.

--ft.


< Manoel Fernando Tenorio                             >
< (tenorio@ecn.purdue.edu) or (..!pur-ee!tenorio)     >
< MSEE233D                                            >
< Parallel Distributed Structures Laboratory          >
< School of Electrical Engineering                    >
< Purdue University                                   >
< W. Lafayette, IN, 47907-1285                        >
< Phone: 317-494-3482 Fax: 317-494-6440               >

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 25 13:49:54 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA15563
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 25 Feb 1994 13:49:50 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id KAA04186 for <Genetic-Programming@list.stanford.edu>; Fri, 25 Feb 1994 10:58:09 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from lightstream.LightStream.COM (lightstream.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA19320; Fri, 25 Feb 1994 10:57:04 -0800
Received: from cockatrice.LightStream.COM by lightstream.LightStream.COM (4.1/SMI-4.1)
	id AA28028; Fri, 25 Feb 94 13:57:00 EST
Received: by cockatrice.LightStream.COM (4.1/SMI-4.1)
	id AA09474; Fri, 25 Feb 94 13:56:58 EST
Message-Id: <9402251856.AA09474@cockatrice.LightStream.COM>
To: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Cc: genetic-programming@cs.stanford.edu, dfaulkne@LightStream.COM
Subject: Re: Diversity and sexiness in GA/GP 
In-Reply-To: Your message of "Thu, 24 Feb 1994 17:25:04 PST."
             <9402250125.AA19628@hume.CS.ORST.EDU> 
Date: Fri, 25 Feb 1994 13:56:57 -0500
From: Dave Faulkner <dfaulkne@LightStream.COM>
Status: RO


Peter Dudey Writes:

	In other words, an individual is sexy to you to the extent that it
	does better than you on the test cases.  The formula might be changed
	to sum-of-squares to increase the relative sexiness of individuals
	that vastly outperform you in some areas.

...and Mike Keith chips in with:

	However, I think it makes more sense to me for "sexiness" to increase
	based on the opposites attract premise. That is, an individual will
	desire others whose profile is good but UNIQUE from his own.
	....

	Sexiness = C1*rawFitness + C2*Uniqueness

	Where uniqueness is based on the sum of squares of differences over the
	test case profile as Peter mentioned. The second term above will take more
	effect later in the run.

Thank you for your thoughts on this matter.  These are interesting metrics to
be sure, but I don't think that they address the issue of maintaining *diversity*
in a population based on uniqueness of genotype.  Fitness is twice removed from
genotypic representation, as the genotype is mapped into phenotype (problem
domain) and the phenotype is then mapped to fitness (expression of an individual
in the problem ecology).  This distance from the original representation 
makes it difficult to maintain diversity in the population because the uniqueness
of the individual is no longer represented in the fitness, regardless of how
many cases are examined.

This will lead, I believe, to faster convergence of the population toward a
solution, but will not necessarily lead to a diversified population that is
able to adapt quickly to later changes in the environment (thus the
computation will run out of "steam").

Think of a fitness surface with two hills.  You want equal numbers of beings at
each hill site if they are equal in height.  With the metrics you propose, there
is no difference in the metric if a being is at the same height on either hill,
so there is no bias toward picking the guy (gal) not on your being's hill
to mate with.  It's only by looking at the genotypic (and some would say that
phenotypic is preferred) representation that you know that the other guy (gal)
is not on your being's hill, and so to maintain diversity you should pick
the guy (gal) on the other hill to mate.
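
The two-hill picture can be sketched in a few lines of Python.  Everything
here is hypothetical: a one-dimensional genotype `x`, a made-up surface
`height` with equal hills at x = -1 and x = +1, and plain absolute
difference as the genotype distance.

```python
import math


def height(x):
    """Hypothetical fitness surface: two equal hills, at x = -1 and x = +1."""
    return math.exp(-(abs(x) - 1.0) ** 2)


def fitness_gap(a, b):
    """Difference as seen by a purely fitness-based metric."""
    return abs(height(a) - height(b))


def genotype_gap(a, b):
    """Difference as seen by a genotype-based metric."""
    return abs(a - b)


me = -1.0          # sitting on top of the left hill
same_hill = -0.9   # a neighbour on my own hill
other_hill = 0.9   # equally high, but on the right hill

# The fitness-based view cannot tell the two candidates apart.
print(fitness_gap(same_hill, other_hill))  # 0.0
# The genotype-based view can: the second candidate is much farther away.
print(genotype_gap(me, same_hill) < genotype_gap(me, other_hill))  # True
```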

In GP, the problem is understanding nearness in the genotypic or phenotypic
representations, I would guess.  Hamming distance calculations take
advantage of the consistency of a parameter string: position 3 always
means "the x value" in some function, has a range from -1..1, etc.
GP s-expressions have no such consistency, and no obvious canonical
form to facilitate comparisons between s-expressions; thus no easy
metric.

Collins ("Studies in Artificial Evolution") and others talk about
using "demes" to maintain diversity.  This is an interesting idea.  Rather
than measuring distances between individuals, you superimpose an artificial
distance that ignores representation, and the sustaining of that distance
bias generation after generation creates isolated niches that are
allowed to maintain uniqueness within the population due to this 
stochastic barrier prohibiting genetic material swapping, thus creating
"islands" of relatively fit, unique individuals. (Think of an island
of birds where on rare occasion, one gets blown over to another island
to affect the gene pool of the other island bird population).
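
A rough Python sketch of the deme idea, assuming a simple island layout.
The function names, the per-individual migration `rate`, and the rule that
both parents come from one island are illustrative assumptions, not taken
from Collins' work.

```python
import random


def migrate(islands, rate, rng):
    """With probability `rate`, move each individual to a random other island."""
    moves = []
    for src, island in enumerate(islands):
        for individual in island:
            if len(islands) > 1 and rng.random() < rate:
                dst = rng.choice([i for i in range(len(islands)) if i != src])
                moves.append((src, dst, individual))
    # Apply the moves afterwards so nobody is moved twice in one pass.
    for src, dst, individual in moves:
        islands[src].remove(individual)
        islands[dst].append(individual)
    return islands


def pick_parents(islands, rng):
    """Both parents come from the same island; that is the mating barrier."""
    island = rng.choice([isl for isl in islands if len(isl) >= 2])
    return rng.sample(island, 2)


rng = random.Random(0)
# Three islands of four uniquely named placeholder individuals.
islands = [[f"i{d}_{k}" for k in range(4)] for d in range(3)]
migrate(islands, rate=0.05, rng=rng)   # the rare bird blown to another island
mother, father = pick_parents(islands, rng)
print(mother, father)
```

No distance between individuals is ever computed; the only thing keeping
the islands apart is the low migration rate.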

The problem I have with this approach is that this diversity isn't
really guaranteed: it's an effect of the initial randomness of the population
and the isolation of the islands.  Given enough time, can we be sure
that the majority of islands will not converge on the same hill,
and eventually that all islands will converge to the same hill?
This seems like a logical conclusion; demes create divergence, 
but only for a while (that time being extended by the size of islands
and population exchange rate).  I can well see how demes will slow convergence
for a very long time.  In nature, I believe diversity is a function
of the environment as well as the population dynamics (e.g., what
if each deme has a slightly different fitness function), and
unlike most of our toy models, fitness functions are not time-invariant
(evolution by subsumption (?) competition excluded).

Of course, tying genotypic representation to fitness modulation has
many problems as well: it must be centralized, it is expensive,
it is representation specific, etc. It also seems very hard for GP
(at least for me).

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 25 12:43:07 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA12717
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 25 Feb 1994 12:43:03 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA03692 for <Genetic-Programming@list.stanford.edu>; Fri, 25 Feb 1994 09:32:20 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA15899; Fri, 25 Feb 1994 09:31:14 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA15159; Fri, 25 Feb 94 09:31:11 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA19754; Fri, 25 Feb 94 09:31:10 PST
Date: Fri, 25 Feb 94 09:31:10 PST
Message-Id: <9402251731.AA19754@hume.CS.ORST.EDU>
To: keithm@icd.ab.com
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: "Mike J. Keith"'s message of Fri, 25 Feb 1994 12:13:44 -0500 <199402251713.MAA19047@odin.icd.ab.com>
Subject: Diversity and sexiness in GA/GP
Status: RO

 > Date: Fri, 25 Feb 1994 12:13:44 -0500
 > From: "Mike J. Keith" <keithm@icd.ab.com>
 > 
 > Peter responded:
 > 
 > >Wouldn't this make a successful twin of yourself and a horribly unfit
 > >mutant appear equally sexy?  Wouldn't that be a bad thing?
 > 
 > A successful twin of yourself would have a low uniqueness causing the 
 > 2nd term to be low. An unfit mutant would have a low first term. Both terms
 > will only be high when the potential mate has a high default fitness and
 > also has a unique fitness profile with respect to the chooser.
 > 
 > The problem with using the straight diff or a cube of the diff is that the
 > differences between 2 individuals can cancel out. So if most of the population

Hmmm.  Good point.  Still, it appears that your scheme rewards
inferiority at some test case (although it would punish inferiority at
many).  How about this?

Sexiness = SUM  superiority ( potential-mate, self, test_case(i) )
            i

Where superiority is:

If the potential mate does better than you on test case i, the square of
the diff of your fitnesses; otherwise, minus the (absolute) diff of your
fitnesses.

For example:

Test Case       1       2       3       4       5

Self            3       7       3       8       2
P. Mate         8       2       9       3       1

superiority    25      -5      36      -5      -1

resulting in a sexiness of 50.

If squaring is too potent, a smaller power might be in order, as might
some normalization.  Note that both of the schemes I have offered
allow for negative sexiness, and an individual looking at a copy of
itself will find it to have sexiness 0.
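
The worked example above can be checked with a short Python sketch.  It
assumes, as in the table, that a higher score on a test case is better;
the function and variable names are only illustrative.

```python
def superiority(mate_score, self_score):
    """Square of the gap if the mate is better, otherwise minus the gap."""
    diff = mate_score - self_score
    if diff > 0:
        return diff ** 2
    return diff  # equals -(self - mate); zero on a tie


def sexiness(mate_profile, self_profile):
    """Sum of the per-test-case superiority terms."""
    return sum(superiority(m, s) for m, s in zip(mate_profile, self_profile))


self_profile = [3, 7, 3, 8, 2]
mate_profile = [8, 2, 9, 3, 1]

terms = [superiority(m, s) for m, s in zip(mate_profile, self_profile)]
print(terms)                                 # [25, -5, 36, -5, -1]
print(sexiness(mate_profile, self_profile))  # 50
print(sexiness(self_profile, self_profile))  # 0
```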

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\   "We feel confident that there is a 4-line LISP hack for consciousness."   /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 25 12:14:36 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA11439
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 25 Feb 1994 12:14:33 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA03578 for <Genetic-Programming@list.stanford.edu>; Fri, 25 Feb 1994 09:14:58 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from odin.icd.ab.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA15222; Fri, 25 Feb 1994 09:13:50 -0800
Received: from gadwal.icd.ab.com (gadwal.icd.ab.com [130.151.132.71]) by odin.icd.ab.com (8.1C/5.6) with SMTP id MAA19047; Fri, 25 Feb 1994 12:13:44 -0500
Date: Fri, 25 Feb 1994 12:13:44 -0500
From: "Mike J. Keith" <keithm@icd.ab.com>
Message-Id: <199402251713.MAA19047@odin.icd.ab.com>
To: dudeyp@chert.CS.ORST.EDU
Subject: Re: Diversity and sexiness in GA/GP
Cc: genetic-programming@cs.stanford.edu
Status: RO

I offered:

> > Sexiness = C1*rawFitness + C2*Uniqueness
> > 
> > Where uniqueness is based on the sum of squares of differences over the
> > test case profile as Peter mentioned. The second term above will take more
> > effect later in the run.

Peter responded:

>Wouldn't this make a successful twin of yourself and a horribly unfit
>mutant appear equally sexy?  Wouldn't that be a bad thing?

A successful twin of yourself would have a low uniqueness causing the 
2nd term to be low. An unfit mutant would have a low first term. Both terms
will only be high when the potential mate has a high default fitness and
also has a unique fitness profile with respect to the chooser.

You can obviously adjust C1 and C2 (perhaps even dynamically) to control
how important you want uniqueness to be with respect to raw fitness (note
that raw fitness is just your normal error sum).

The problem with using the straight diff or a cube of the diff is that the
differences between 2 individuals can cancel out. Suppose most of the
population is able to solve half of the test cases very well, and along comes
an individual who is a bit worse at the known half but a bit better at the
other cases. By using the sum of the squares in the 2-term formula above, this
individual would be very sexy, which is what I would think you're after.

If you use just the diff or the cube, you can only find individuals who
are better overall, which normal GA or GP does anyway - looking at the
profile then doesn't really buy you anything ......does it ??
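
A small Python sketch of the cancellation argument and of the 2-term
formula.  It assumes per-case scores where higher is better, so rawFitness
is taken as the plain sum of the scores (with an error sum the sign of the
first term would flip).  The profiles and the weights C1 = 10, C2 = 1 are
made up for illustration.

```python
def summed_diff(mate, chooser, power=1):
    """Sum over test cases of (mate - chooser) ** power."""
    return sum((m - c) ** power for m, c in zip(mate, chooser))


def sexiness(mate, chooser, c1, c2):
    """Sexiness = C1*rawFitness + C2*Uniqueness, uniqueness = sum of squares."""
    raw_fitness = sum(mate)
    uniqueness = summed_diff(mate, chooser, power=2)
    return c1 * raw_fitness + c2 * uniqueness


chooser    = [9, 9, 9, 1, 1, 1]   # good at the "known" half only
twin       = [9, 9, 9, 1, 1, 1]   # successful copy of the chooser
newcomer   = [8, 8, 8, 2, 2, 2]   # a bit worse on one half, a bit better on the other
mutant     = [0, 0, 0, 0, 0, 0]   # unfit everywhere

print(summed_diff(newcomer, chooser, power=1))  # 0: straight diffs cancel
print(summed_diff(newcomer, chooser, power=3))  # 0: cubes cancel too
print(summed_diff(newcomer, chooser, power=2))  # 6: squares do not

for name, mate in [("twin", twin), ("newcomer", newcomer), ("mutant", mutant)]:
    print(name, sexiness(mate, chooser, c1=10, c2=1))
# twin 300, newcomer 306, mutant 246
```

The ranking depends on the weights: with C1 = C2 = 1 the mutant's large
uniqueness term (246) would outweigh everything else, so C1 has to be
large enough for raw fitness to dominate.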


Mike

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 25 11:46:12 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA10204
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 25 Feb 1994 11:46:10 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id IAA03414 for <Genetic-Programming@list.stanford.edu>; Fri, 25 Feb 1994 08:47:57 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA14359; Fri, 25 Feb 1994 08:46:53 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA14900; Fri, 25 Feb 94 08:46:49 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA19737; Fri, 25 Feb 94 08:46:49 PST
Date: Fri, 25 Feb 94 08:46:49 PST
Message-Id: <9402251646.AA19737@hume.CS.ORST.EDU>
To: rothfusza@osprey.nwrc.gov
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: rothfusza@osprey.nwrc.gov's message of Fri, 25 Feb 94 08:53:30 cst <9401257621.AA762195210@osprey.nwrc.gov>
Subject: Diversity and sexiness in GA/GP 
Status: RO

 > Date: Fri, 25 Feb 94 08:53:30 cst
 > From: rothfusza@osprey.nwrc.gov
 > 
 >     Hello Peter:
 > 
 >     Your "sexiness" posting to the GP list sounds like a generalization of
 >     my own selection algorithm.  I do not completely understand your
 >     summation, however.  You seem to indicate that for each fitness test,
 >     you calculate the success of each possible mate and the success of the
 >     "self", take the difference, and accumulate this value over n tests.
 >     Does this mean that "self" gets tested n times against m possible
 >     mates?  Does each possible mate repeat the process?  What defines a
 >     possible mate?

Each individual is tested exactly once on each of the fitness cases.
These are just problems in the domain that the individuals are
supposed to be good at.  For example, if one is breeding polynomials
to fit a curve, the fitness cases might be the points on the curve.
These are stored so that they can be re-used.  The idea is that if I
closely fit the first two-thirds of the data points, I want to breed
with someone who does well in general, and specifically on the last
one-third.

I don't have any demes or the like in mind, so everyone is a potential
mate.

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\   "We feel confident that there is a 4-line LISP hack for consciousness."   /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 25 12:04:47 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA10062
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 25 Feb 1994 11:43:08 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id IAA03367 for <Genetic-Programming@list.stanford.edu>; Fri, 25 Feb 1994 08:38:33 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA13925; Fri, 25 Feb 1994 08:37:29 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA14853; Fri, 25 Feb 94 08:37:17 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA19734; Fri, 25 Feb 94 08:37:16 PST
Date: Fri, 25 Feb 94 08:37:16 PST
Message-Id: <9402251637.AA19734@hume.CS.ORST.EDU>
To: keithm@icd.ab.com
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: "Mike J. Keith"'s message of Fri, 25 Feb 1994 09:01:55 -0500 <199402251401.JAA15298@odin.icd.ab.com>
Subject: Diversity and sexiness in GA/GP
Status: RO

 > Date: Fri, 25 Feb 1994 09:01:55 -0500
 > From: "Mike J. Keith" <keithm@icd.ab.com>
 > 
 > However, I think it makes more sense to me for "sexiness" to increase
 > based on the opposites attract premise. That is, an individual will
 > desire others whose profile is good but UNIQUE from his own.
 > 
 > So if you have 3 individuals A, B, and C where A is looking for a
 > mate. Even if B's raw fitness is better than C's, A will desire C if
 > C's profile is unique enough - hey I would rather kiss a stranger than
 > my sister even if my sister is more attractive:
 > 
 > Sexiness = C1*rawFitness + C2*Uniqueness
 > 
 > Where uniqueness is based on the sum of squares of differences over the
 > test case profile as Peter mentioned. The second term above will take more
 > effect later in the run.

Wouldn't this make a successful twin of yourself and a horribly unfit
mutant appear equally sexy?  Wouldn't that be a bad thing?

I figured one should choose a mate that did /well/ on problems that
the chooser found difficult, not just one that had /different/ scores.
Hmmm.  I suppose, then, that under my original model, sum-of-cubes
might be better than sum-of-squares, because squares would only
emphasize difference.

It is worth pointing out that this technique only ensures phenotypic,
rather than genotypic, diversity.

/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\   "We feel confident that there is a 4-line LISP hack for consciousness."   /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 25 09:04:33 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA03042
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 25 Feb 1994 09:04:30 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id GAA02634 for <Genetic-Programming@list.stanford.edu>; Fri, 25 Feb 1994 06:03:22 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from odin.icd.ab.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA09300; Fri, 25 Feb 1994 06:02:12 -0800
Received: from gadwal.icd.ab.com (gadwal.icd.ab.com [130.151.132.71]) by odin.icd.ab.com (8.1C/5.6) with SMTP id JAA15298; Fri, 25 Feb 1994 09:01:55 -0500
Date: Fri, 25 Feb 1994 09:01:55 -0500
From: "Mike J. Keith" <keithm@icd.ab.com>
Message-Id: <199402251401.JAA15298@odin.icd.ab.com>
To: dudeyp@chert.CS.ORST.EDU
Subject: Re: Diversity and sexiness in GA/GP
Cc: genetic-programming@cs.stanford.edu
Status: RO

>In other words, an individual is sexy to you to the extent that it
>does better than you on the test cases.  The formula might be changed
>to sum-of-squares to increase the relative sexiness of individuals
>that vastly outperform you in some areas.

This is an interesting idea. The way I look at it is that you're looking at
a "fitness-profile" instead of just a single value fitness.

However, I think it makes more sense to me for "sexiness" to increase
based on the opposites attract premise. That is, an individual will
desire others whose profile is good but UNIQUE from his own.

So if you have 3 individuals A, B, and C where A is looking for a
mate. Even if B's raw fitness is better than C's, A will desire C if
C's profile is unique enough - hey I would rather kiss a stranger than
my sister even if my sister is more attractive:

Sexiness = C1*rawFitness + C2*Uniqueness

Where uniqueness is based on the sum of squares of differences over the
test case profile as Peter mentioned. The second term above will take more
effect later in the run.

>My prediction is that this will increase
>diversity, and possibly provide additional benefits by bringing
>together mutually beneficial subsolutions.

Yeah it seems like that's what the fitness-vector or profile concept
could buy you.

Mike

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Feb 24 20:13:40 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA24339
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 24 Feb 1994 20:13:36 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id RAA29399 for <Genetic-Programming@list.stanford.edu>; Thu, 24 Feb 1994 17:26:32 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from research.CS.ORST.EDU (chert.CS.ORST.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA22687; Thu, 24 Feb 1994 17:25:21 -0800
Received: from hume.CS.ORST.EDU by research.CS.ORST.EDU (4.1/1.30)
	id AA05325; Thu, 24 Feb 94 17:25:05 PST
From: dudeyp@chert.CS.ORST.EDU (Peter Dudey)
Received: by hume.CS.ORST.EDU (4.1/CS-Client)
	id AA19628; Thu, 24 Feb 94 17:25:04 PST
Date: Thu, 24 Feb 94 17:25:04 PST
Message-Id: <9402250125.AA19628@hume.CS.ORST.EDU>
To: dfaulkne@lightstream.com, tadepall@chert.CS.ORST.EDU
Cc: kinnear@adapt.com, genetic-programming@cs.stanford.edu,
        dfaulkne@lightstream.com
In-Reply-To: Dave Faulkner's message of Thu, 24 Feb 1994 19:27:03 -0500 <9402250027.AA08175@cockatrice.LightStream.COM>
Subject: Diversity and sexiness in GA/GP 
Status: RO

 > Date: Thu, 24 Feb 1994 19:27:03 -0500
 > From: Dave Faulkner <dfaulkne@LightStream.COM>
 > 
 > I have had some success in maintaining diversity in GA populations by using a
 > fitness suppression technique.  Briefly, I measure the normalized sum of the
 > Hamming Distances between a member of the population and all other members of 
 > the population.  This number can be used to determine how "near" a member is
 > on average to the other members of the population, and so can be used to
 > suppress a fitness value (say, by multiplying this number by the fitness value
 > and then later re-normalizing the population's fitness values).  This tends
 > to suppress the modified fitness value of clusters of beings so that good,
 > lonesome beings that pop up have a good chance of multiplying within the
 > population before dying off (elitism also helps here).  This technique is
 > mentioned briefly in Goldberg's book ("Genetic Algorithms in Optimization...").

I have an idea for a somewhat related scheme that would, I think, work
for GP as well.  It requires that the fitness test involve a number of
"test cases", where fitness is the sum of an individual's success over
these cases.

Measure fitness for each individual for each test case.  More fit
individuals are more likely to breed, as usual, but you only select
one individual to breed, and that individual chooses its mate.  The
mate is chosen, either deterministically or stochastically, according
to "sexiness", where:
             __
             \
sexiness  =   >  success ( potential-mate ) - success (self)
             /_
	      n

and n is the number of cases in the fitness test.


In other words, an individual is sexy to you to the extent that it
does better than you on the test cases.  The formula might be changed
to sum-of-squares to increase the relative sexiness of individuals
that vastly outperform you in some areas.
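
As a sketch of the deterministic variant in Python (the profiles are
made-up 0/1 success values and the function names are illustrative):

```python
def sexiness(mate, chooser):
    """Sum over test cases of success(potential mate) - success(self)."""
    return sum(m - c for m, c in zip(mate, chooser))


def choose_mate(chooser_name, profiles):
    """Deterministic choice: the candidate with the highest sexiness."""
    chooser = profiles[chooser_name]
    scores = {name: sexiness(profile, chooser)
              for name, profile in profiles.items() if name != chooser_name}
    return max(scores, key=scores.get)


profiles = {
    "A": [1, 1, 0, 0],
    "B": [1, 1, 1, 0],
    "C": [0, 1, 1, 1],
    "D": [1, 1, 1, 1],
}
print(choose_mate("A", profiles))  # D
```

Note that the plain sum is the same as the difference in total success,
sum(mate) - sum(self), so in this linear form B and C look equally sexy
to A; the per-case profile only starts to matter once the terms are
squared or otherwise made nonlinear.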

I plan to examine this in my Master's thesis (most of which will be
done this summer).  My prediction is that this will increase
diversity, and possibly provide additional benefits by bringing
together mutually beneficial subsolutions.  (Prediction?  Testing?
This must be science!  :-)  )

Discussion, pointer to literature?  Is this a (social or genetic)
factor in biological sexual attraction?

Thanks,
/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\
\ Peter Dudey, MS student in Artificial Intelligence, Oregon State University /
/ dudeyp@research.cs.orst.edu : hagbard on IGS : 257 NE 13th, Salem, OR 97301 \
\   "We feel confident that there is a 4-line LISP hack for consciousness."   /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Feb 24 19:28:46 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA22633
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 24 Feb 1994 19:28:41 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id QAA28985 for <Genetic-Programming@list.stanford.edu>; Thu, 24 Feb 1994 16:28:14 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from lightstream.LightStream.COM (lightstream.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA19460; Thu, 24 Feb 1994 16:27:09 -0800
Received: from cockatrice.LightStream.COM by lightstream.LightStream.COM (4.1/SMI-4.1)
	id AA11206; Thu, 24 Feb 94 19:27:06 EST
Received: by cockatrice.LightStream.COM (4.1/SMI-4.1)
	id AA08175; Thu, 24 Feb 94 19:27:04 EST
Message-Id: <9402250027.AA08175@cockatrice.LightStream.COM>
To: kinnear@adapt.com
Cc: genetic-programming@CS.STANFORD.EDU, dfaulkne@LightStream.COM
Subject: Re: Network parallelism. 
In-Reply-To: Your message of "Wed, 23 Feb 1994 11:20:37 EST."
             <9402231620.AA21884@adapt.com> 
Date: Thu, 24 Feb 1994 19:27:03 -0500
From: Dave Faulkner <dfaulkne@LightStream.COM>
Status: RO


Re: running out of steam; paper by H. Iba::

I ask the following questions in the public forum because others may be
asking the same questions:

	What are MDL-based fitness functions, and how would they maintain
	diversity in the population?

	How can we e-mail Morgan Kaufmann to order a copy of the ICGA-5 '93
	book (which seems to contain many interesting papers)?

Re: "steam" and fitness function modulation::

I have had some success in maintaining diversity in GA populations by using a
fitness suppression technique.  Briefly, I measure the normalized sum of the
Hamming Distances between a member of the population and all other members of 
the population.  This number can be used to determine how "near" a member is
on average to the other members of the population, and so can be used to
suppress a fitness value (say, by multiplying this number by the fitness value
and then later re-normalizing the population's fitness values).  This tends
to suppress the modified fitness value of clusters of beings so that good,
lonesome beings that pop up have a good chance of multiplying within the
population before dying off (elitism also helps here).  This technique is
mentioned briefly in Goldberg's book ("Genetic Algorithms in Optimization...").
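
A minimal Python sketch of the suppression step, with bit-string
genotypes.  The exact normalizations are assumptions for illustration: the
distance sum is divided by (number of other members) x (string length), and
the modified fitness values are rescaled to sum to 1.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_distance(index, population):
    """Sum of Hamming distances to all other members, normalized to [0, 1]."""
    me = population[index]
    others = [g for i, g in enumerate(population) if i != index]
    total = sum(hamming(me, g) for g in others)
    return total / (len(others) * len(me))


def modulated_fitness(population, fitness):
    """Multiply each fitness by the member's mean distance, then renormalize."""
    scaled = [f * mean_distance(i, population) for i, f in enumerate(fitness)]
    total = sum(scaled)
    if total == 0:  # fully converged population: fall back to equal shares
        return [1.0 / len(scaled)] * len(scaled)
    return [s / total for s in scaled]


population = ["1111", "1111", "1110", "0000"]  # a cluster plus one loner
fitness = [10.0, 10.0, 9.0, 8.0]               # the loner has the lowest raw fitness

shares = modulated_fitness(population, fitness)
print(shares)
```

In this example the lonesome "0000" ends up with the largest share even
though its raw fitness is the lowest, because the three clustered members
are suppressed by their small mean distance.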

Has anyone else used similar techniques of fitness modulated by genotype
"nearness"?

In the GP world, my guess is that this is very difficult because it isn't easy to
create nearness metrics on the s-expressions.  Even assuming that a canonical
s-expression can be created, how do we measure differences in structure? Has
anyone thought about such metrics or even their value in such a context?  Does a
stack representation help here?

Thinking GP when the chance arises -- Dave Faulkner.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Feb 24 18:06:01 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA17847
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 24 Feb 1994 17:23:28 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id OAA28041 for <Genetic-Programming@list.stanford.edu>; Thu, 24 Feb 1994 14:18:50 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from dynamo.ecn.purdue.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA13237; Thu, 24 Feb 1994 14:17:46 -0800
Received: from msesmac11.ecn.purdue.edu by dynamo.ecn.purdue.edu (5.65/1.32jrs)
	id AA24469; Thu, 24 Feb 94 17:17:35 -0500
Message-Id: <9402242217.AA24469@dynamo.ecn.purdue.edu>
Date: Thu, 24 Feb 1994 17:17:38 -0600
To: <kinnear@adapt.com>
From: tenorio@ecn.purdue.edu
X-Sender: tenorio@dynamo.ecn.purdue.edu
Subject: Re: Network parallelism.
Cc: genetic-programming@cs.stanford.edu
Status: RO

>> From thehulk!icd.ab.com!keithm Thu Feb 24 14:42:47 1994
>> Date: Thu, 24 Feb 1994 13:36:38 -0500
>> From: "Mike J. Keith" <keithm@icd.ab.com>
>> To: kinnear@adapt.com
>> Subject: Re: Network parallelism.
>> Cc: genetic-programming@cs.stanford.edu
>> Content-Length: 1255
>> 
>> [...]
>> 
>> Note I gathered that the GMDH algorithm seems to only be applicable to math
>> polynomial expressions ?? It also seems to make the GA go slower but
>> that may be a good tradeoff based on the fact that you seem to get
>> long runs.
>> 
>> Mike
>> 
>
>        I'm going to get in over my head here pretty rapidly, since I'm
>        *not* an expert on GMDH. I'm interested in the work that Iba
>        et. al. have done with it, but I haven't tried it myself.
>
>        FYI, I *have* seen work from Iba et. al. where they extended it
>        to learning boolean functions.  It was in their technical
>        report ETL-TR-94-2 "System Identification Approach to Genetic
>        Programming", which is only available by snail mail from the
>        author(s): iba@etl.go.jp , kurita@etl.go.jp , sato@etl.go.jp.
>        In it they solved the 5 parity problem, the 6 mux problem, and
>        the "two box" problem, among others.
>
>  
>        Cheers -- Kim

I am new to this list, so forgive me if this posting seems redundant. I have
done some work with GMDH and extended it in several ways. First, I removed
the restriction on the functions in the node: they could come from an
arbitrary set of prototype functions. Second, the error criterion included
an MDL term. Third, the connectivity was made arbitrary in both order and
form (we even later extended it to recursions). Finally, we used a
modification of simulated annealing to generate the connection graph.

The resulting algorithm was at first very slow. Back in 1987 it took us
2 weeks on a Gould supermini to run a Mackey-Glass time series prediction
problem. Later, John Kassebaum noted that with some programming tricks and
a clever redefinition of the problem we could bring the network to produce
between 10 and 200 nodes per second. Today we run a modified version of
the algorithm on a Mac II class machine in a matter of minutes. The
reference to this work is in IEEE Trans. on NN vol 1 no. 1 1990, pg 100.,
Self Organizing Network for System Identification.

A few things from our experience are pertinent to the posting. First, there
is no restriction on the types of functions used (put in a tanh and you make
a NN, etc.); the problems in domains such as boolean problems have more to
do with the form of the error function than anything else. Second, we found
that there could be something akin to MDL for computer time as well: a very
high number of nodes may or may not add significantly to your performance.
Most adequate performance happened with a few nodes, and adding more brought
no significant improvement (certainly not if the time required was taken
into consideration). Third, the higher the number of nodes, the lower (apart
from exhaustive search) the probability of finding the optimum combination.
On this point, there exists for any problem a whole class of models that
satisfy the penalty function within an epsilon, so the point may be moot.

The problem with the polynomial class of GMDH algorithms (SONN - polynomial
function included) is that the algorithm can only generate a subset of all
possible functions. This has been shown very eloquently by John Kassebaum
in his master's thesis. Our group, in the Parallel Distributed Structures
Lab at Purdue, was intrigued by the capabilities of such nets in '87, since
we wanted to show parallels with the universality of NN. It turns out
that using functions of functions (GMDH) does not allow one to compute even
all possible polynomials, especially when the connectivity is restricted as
per the original algorithm. Therefore, in spite of its appeal, this form of
trade-off of depth (global approximation) for width (local approximation) is
not really two-way.

Finally, our group, with Rasiul Safavian, Antonio Thome and William Hsu,
has performed several experiments in pattern recognition, high dimensional
satellite data classification, MSS data classification, and 2 1/2 D target
identification, all with excellent results. The models of soil erosion
reduced the computation process from minutes on a minicomputer (as per the
original algorithm) to seconds on a calculator (as per the SONN closed form
model). But the disappointing fact is that this form of algorithm remains
functionally restricted, which dims my enthusiasm towards a broader
acceptability.


--ft.


< Manoel Fernando Tenorio                             >
< (tenorio@ecn.purdue.edu) or (..!pur-ee!tenorio)     >
< MSEE233D                                            >
< Parallel Distributed Structures Laboratory          >
< School of Electrical Engineering                    >
< Purdue University                                   >
< W. Lafayette, IN, 47907-1285                        >
< Phone: 317-494-3482 Fax: 317-494-6440               >

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Feb 24 15:50:45 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA12487
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 24 Feb 1994 15:09:39 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id MAA26858 for <Genetic-Programming@list.stanford.edu>; Thu, 24 Feb 1994 12:13:18 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from dmc.com (HULK.DMC.COM) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA07266; Thu, 24 Feb 1994 12:12:08 -0800
Received: from oak by DMC.COM (MX V3.3 VAX) with UUCP; Thu, 24 Feb 1994
          15:09:11 EST
Received: by adapt.com (4.1/SMI-4.1) id AA23327; Thu, 24 Feb 94 15:02:42 EST
Date: Thu, 24 Feb 94 15:02:42 EST
From: <kinnear@adapt.com>
Message-Id: <9402242002.AA23327@adapt.com>
To: kinnear@adapt.com, keithm@icd.ab.com
Subject: Re: Network parallelism.
Cc: genetic-programming@cs.stanford.edu
Status: RO


> From thehulk!icd.ab.com!keithm Thu Feb 24 14:42:47 1994
> Date: Thu, 24 Feb 1994 13:36:38 -0500
> From: "Mike J. Keith" <keithm@icd.ab.com>
> To: kinnear@adapt.com
> Subject: Re: Network parallelism.
> Cc: genetic-programming@cs.stanford.edu
> Content-Length: 1255
> 
> [...]
> 
> Note I gathered that the GMDH algorithm seems to only be applicable to math
> polynomial expressions ?? It also seems to make the GA go slower but
> that may be a good tradeoff based on the fact that you seem to get
> long runs.
> 
> Mike
> 

	I'm going to get in over my head here pretty rapidly, since I'm
	*not* an expert on GMDH. I'm interested in the work that Iba
	et al. have done with it, but I haven't tried it myself.

	FYI, I *have* seen work from Iba et al. where they extended it
	to learning boolean functions.  It was in their technical
	report ETL-TR-94-2 "System Identification Approach to Genetic
	Programming", which is only available by snail mail from the
	author(s): iba@etl.go.jp , kurita@etl.go.jp , sato@etl.go.jp.
	In it they solved the 5 parity problem, the 6 mux problem, and
	the "two box" problem, among others.

	-----------

	I'm impressed with the amount of machine time it would take to
	run 1740 generations, which is one of the reasons that I haven't
	tried it out (not to mention that GMDH is not a simple "drop
	in" to an existing GP system).

	Cheers -- Kim

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Feb 24 15:35:59 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA09299
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 24 Feb 1994 13:49:17 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id KAA26142 for <Genetic-Programming@list.stanford.edu>; Thu, 24 Feb 1994 10:38:07 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from odin.icd.ab.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA02897; Thu, 24 Feb 1994 10:36:58 -0800
Received: from gadwal.icd.ab.com (gadwal.icd.ab.com [130.151.132.71]) by odin.icd.ab.com (8.1C/5.6) with SMTP id NAA14439; Thu, 24 Feb 1994 13:36:38 -0500
Date: Thu, 24 Feb 1994 13:36:38 -0500
From: "Mike J. Keith" <keithm@icd.ab.com>
Message-Id: <199402241836.NAA14439@odin.icd.ab.com>
To: kinnear@adapt.com
Subject: Re: Network parallelism.
Cc: genetic-programming@CS.Stanford.EDU
Status: RO


>However, I'm also aware of an interesting (and possibly counter)
>example to the "run out of steam" hypothesis, at least as regards
>generations. 

>What is relevant to this discussion is that their success with the time
>series experiments came after 1740 generations.  After 233 generations
>their mean square error on the testing data was .01261, and after 1740
>generations it was 5.06E-6.  Now, they also use MDL based fitness

This is very interesting. After browsing their paper it appears that
their Group Method of Data Handling algorithm (GMDH) acts as a diversity
generator allowing for "endless evolution".

Because they are relying on the GMDH to create expressions in addition
to normal GA operators (like crossover and mutation), you're still
stuck with the question of how to maintain diversity in a GA.

But perhaps what everyone is finding out is that with normal crossover
based GAs or GP, you will always run out of gas. If you could extend the
GMDH idea to any expressions then maybe you're all set.

Note I gathered that the GMDH algorithm seems to only be applicable to math
polynomial expressions ?? It also seems to make the GA go slower but
that may be a good tradeoff based on the fact that you seem to get
long runs.

Mike

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Wed Feb 23 13:34:14 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA01778
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Wed, 23 Feb 1994 13:34:10 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id KAA17679 for <Genetic-Programming@list.stanford.edu>; Wed, 23 Feb 1994 10:44:56 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from LABS-N.BBN.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA11711; Wed, 23 Feb 1994 10:43:52 -0800
Message-Id: <199402231843.AA11711@Sunburn.Stanford.EDU>
Date:     Wed, 23 Feb 94 13:39:06 EST
From: Nichael Cramer <ncramer@BBN.COM>
To: Charles Erec Stebbins <stebbic@darst-sgi.ROCKEFELLER.EDU>
Cc: genetic-programming@cs.Stanford.EDU
Subject:  Re:  GP and SF
Status: RO

>Date: Wed, 23 Feb 94 12:26:04 -0500
>From: Charles Erec Stebbins <stebbic%darst-sgi.ROCKEFELLER.EDU@rockvax.rockefeller.edu>
>Subject: GP and SF
>
>p.s. The Sagan => Satan connection breaks down with BrOCa's Brain (sp?)
                                                       ^^
                                                       ||
>however. 

Backward Masking.  What more proof do we need?               ;-)   ;-)


Nichael no-more-of-this-nonsense-I-promise Cramer
   

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Wed Feb 23 12:30:10 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA28028
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Wed, 23 Feb 1994 12:15:58 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA17112 for <Genetic-Programming@list.stanford.edu>; Wed, 23 Feb 1994 09:26:10 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from rockvax.rockefeller.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA06543; Wed, 23 Feb 1994 09:25:06 -0800
Received: from darst-sgi.rockefeller.edu by rockvax.ROCKEFELLER.EDU (5.65/1.34)
	id AA02883; Wed, 23 Feb 94 12:25:02 -0500
Received: by darst-sgi.rockefeller.edu (920330.SGI/920502.SGI)
	for @rockvax.rockefeller.edu:genetic-programming@cs.stanford.edu id AA07907; Wed, 23 Feb 94 12:26:04 -0500
Date: Wed, 23 Feb 94 12:26:04 -0500
From: stebbic@darst-sgi.ROCKEFELLER.EDU (Charles Erec Stebbins)
Message-Id: <9402231726.AA07907@darst-sgi.rockefeller.edu>
To: genetic-programming@cs.stanford.edu
Subject: GP and SF
Status: RO

>(If I recall correctly, a similar technique was employed in Carl Sagan's
>novel "Contact".)

Sagan's coding idea was even more interesting.  Several highly advanced
life forms had discovered that another ancient life form long ago
had left a message not in DNA or radio signals, but more unsettlingly
in the fundamental constants of the universe.  An example is pi, which
has a coded region in its decimal expansion that is a message from
this species.  It was, I believe, the great quest of the more advanced
life forms to discover how and why this message was given.  This may be
off a bit, it has been a while since I read the book.

Anyway, I'm not quite sure how any of this relates to GP.  It is fun,
however.

C.E. Stebbins

p.s. The Sagan => Satan connection breaks down with Broca's Brain (sp?)
however.  I'm not sure how the "CO" relates to Satan anyway, I must
confess.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Wed Feb 23 12:35:57 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA29082
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Wed, 23 Feb 1994 12:35:44 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA17208 for <Genetic-Programming@list.stanford.edu>; Wed, 23 Feb 1994 09:47:40 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from dmc.com (HULK.DMC.COM) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA08440; Wed, 23 Feb 1994 09:46:32 -0800
Received: from oak by DMC.COM (MX V3.3 VAX) with UUCP; Wed, 23 Feb 1994
          11:39:55 EST
Received: by adapt.com (4.1/SMI-4.1) id AA21884; Wed, 23 Feb 94 11:20:37 EST
Date: Wed, 23 Feb 94 11:20:37 EST
From: <kinnear@adapt.com>
Message-Id: <9402231620.AA21884@adapt.com>
To: conor@ravenloft.ucc.ie, p00396@psilink.com
Subject: Re: Network parallelism.
Cc: genetic-programming@cs.stanford.edu
Status: RO


> Andy says:
> 
> Tackett's comment that you would be better off doing completely 
> separate runs on different workstations is astute.  This problem (that 
> you run out of steam on a big run) seems to be a weakness in crossover 
> based GP.  You can combat this tendency, however, with a better mix of 
> genetic operations.  I must also point out that at low migration rates 
> you have a continuum between a distributed population and separate 
> runs.  Some people use migration rates that are too high.
> 
> 

I tend to agree with Walter on this one, and it is my experience that
runs tend to run out of steam after a certain number of generations as
well.

However, I'm also aware of an interesting (and possibly counter)
example to the "run out of steam" hypothesis, at least as regards
generations.  Hitoshi Iba (et al.) have done some really interesting
work using GP to optimize GMDH networks.  They have worked on
predicting some Mackey-Glass time-series, with considerable success.

What is relevant to this discussion is that their success with the time
series experiments came after 1740 generations.  After 233 generations
their mean square error on the testing data was .01261, and after 1740
generations it was 5.06E-6.  Now, they also use MDL based fitness
functions.  In addition, using GMDH the fitness of an overall node is
never worse than the fitness of its sub-nodes (which may make a big
difference).
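
One way to see why a GMDH node need never be worse than its sub-nodes is
sketched below. This is an illustration, not Iba et al.'s implementation. It
assumes each node is the classic quadratic GMDH polynomial of its two inputs,
with coefficients fitted by least squares on the training data. Setting the
coefficient of the first input to 1 and all others to 0 reproduces that
sub-node exactly, so the fitted node cannot have a larger training error.
The helper names (solve, fit_node, mse) and the toy target are assumptions.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_node(z1, z2, y):
    """Least-squares fit of y ~ a0 + a1*z1 + a2*z2 + a3*z1*z2 + a4*z1^2 + a5*z2^2.
    Returns the node's predictions on the training points."""
    rows = [[1.0, u, v, u * v, u * u, v * v] for u, v in zip(z1, z2)]
    k = len(rows[0])
    # Normal equations (A^T A) coef = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    coef = solve(ata, aty)
    return [sum(c * f for c, f in zip(coef, r)) for r in rows]

def mse(pred, y):
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

# Toy data: the two "sub-nodes" are simply the raw inputs x1 and x2.
rng = random.Random(0)
x1 = [rng.uniform(-1, 1) for _ in range(50)]
x2 = [rng.uniform(-1, 1) for _ in range(50)]
y = [0.5 * u + u * v for u, v in zip(x1, x2)]

node_error = mse(fit_node(x1, x2, y), y)
sub_errors = (mse(x1, y), mse(x2, y))
print(node_error, sub_errors)
```

Note that the guarantee in this sketch holds only for the data the
coefficients were fitted on; error on separate testing data can still go
either way.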

This last property may tie in to what Andy is referring to as a weakness
of crossover based GP (or it may not...).

This gives me the sense that there is some interesting territory out
beyond 50 generations, and that GP doesn't *have* to run out of steam,
at least from a generational standpoint.

Their paper "System Identification using Structured Genetic Algorithms",
Iba, H., et al., is in ICGA-5 '93, available from Morgan Kaufmann.

Cheers -- Kim

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Wed Feb 23 10:21:52 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA22467
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Wed, 23 Feb 1994 10:21:48 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id HAA16594 for <Genetic-Programming@list.stanford.edu>; Wed, 23 Feb 1994 07:30:44 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from LABS-N.BBN.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA29786; Wed, 23 Feb 1994 07:29:40 -0800
Message-Id: <199402231529.AA29786@Sunburn.Stanford.EDU>
Date:     Wed, 23 Feb 94 10:26:19 EST
From: Nichael Cramer <ncramer@BBN.COM>
To: David Andre <phred@leland.stanford.edu>
Cc: genetic-programming@cs.stanford.edu
Subject:  Re:  GP on Star Trek.....
Status: RO

>From: David Andre <phred@leland.stanford.edu>
>Subject: GP on Star Trek.....
>Date: Tue, 22 Feb 1994 22:05:18 -0800 (PST)
>
>Maybe this is old news, but I was watching 'Star Trek -- The next generation',
>and saw an episode in which Picard and the amazing computers (never seem to 
>have trouble with computer power there...) discovered a 'program' that 
>was embedded in the genetic codes of people from many different planets which
>they then 'decoded' and caused a holographic image to come about and explain
>the existence of the code and what not.  The code was placed by an
>ancient civilization in the primordial soup of many planets, and was 
>designed to bring about life similar in form to their own.  (Which explains
>why all life on Star Trek is humanoid.)  Anyway, I thought it was 
>interesting, and wondered if the writers had read about GP or GA's.  

This idea in SF is certainly much older than GP (or GA for that matter).
The basic idea of encoding a long, detailed set of instructions[*] in
the form of a bit-stream was spelled out in detail at least as long ago as
Fred Hoyle's "Andromeda Breakthrough" series (late '50s I believe, although
in that case the signal was via radio).

  [* Instructions for building certain equipment, how to contact the
     sender, etc.]

Several people (including Asimov and, I think, Carl Sagan) followed this up
by extending the basic idea to the use of DNA encodings --viral DNA was a
common vector; they would be self-replicating, hard to wipe out, freely
spreading themselves all over the universe, etc. etc.  Martin Gardner used
exactly this ploy as the basis of a puzzle in one of his columns in an old
issue of Isaac Asimov's SF Magazine: i.e. Here is the DNA sequence of a
recently discovered virus from <wherever>; what is it trying to tell us...

(If I recall correctly, a similar technique was employed in Carl Sagan's
novel "Contact".)

>David Andre

Nichael
 
      "COntact" ....  "COsmic COnnection" ... "COsmos" ... "COmet"...
	    COincidence or...  Carl Sagan, famous astronomer or
		      pawn of Satan! ...  YOU decide!
   

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Wed Feb 23 09:57:23 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA21506
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Wed, 23 Feb 1994 09:57:21 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id HAA16512 for <Genetic-Programming@list.stanford.edu>; Wed, 23 Feb 1994 07:10:43 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from GS61.SP.CS.CMU.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA29421; Wed, 23 Feb 1994 07:09:38 -0800
Message-Id: <199402231509.AA29421@Sunburn.Stanford.EDU>
From: Eric Teller <astro@GS61.SP.CS.CMU.EDU>
Date: Wed, 23 Feb 94 10:09:09 EST
To: genetic-programming@cs.stanford.edu
Subject: "GP, IM, the Halting Problem, etc..."
Status: RO


Hi again GP-land.

About 5 days ago I mentioned that I had a paper entitled

"Genetic Programming, Indexed Memory, the Halting Problem, and
Other Curiosities" 

that touched on some of the issues Nick brought up.

Having already received more requests than I can handle, I've
put the paper in /pub/genetic-programming/papers.

the file is called Curiosities.ps

you can get it by anonymous ftp at ftp.cc.utexas.edu

Thanks for all of the interest !

I'm eager to hear people's reactions too...

: )

Astro Teller.

(P.S.  If you sent me the request on Thursday or Friday last week,
it's coming in the mail.)

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Wed Feb 23 08:15:16 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA16952
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Wed, 23 Feb 1994 08:15:14 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id FAA16090 for <Genetic-Programming@list.stanford.edu>; Wed, 23 Feb 1994 05:09:00 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from BBN.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA26323; Wed, 23 Feb 1994 05:07:55 -0800
Message-Id: <199402231307.AA26323@Sunburn.Stanford.EDU>
From: David Montana <dmontana@BBN.COM>
Subject: Re: RE: Network parallelism
To: genetic-programming@cs.stanford.edu
In-Reply-To: <9402221433.AA07526@isis.sunderland.ac.uk>
Date: Wed, 23 Feb 94 08:06:55 EST
Mail-System-Version: <BBN/MacEMail_v1.5@BBN.COM>
Status: RO

>From: cs0ral@isis.sunderland.ac.uk (r.aler)
>Subject: RE: Network parallelism
>To: genetic-programming@cs.stanford.edu
>Date: Tue, 22 Feb 1994 14:33:39 +0000 (GMT)
>Cc: cs0ral@orac.sunderland.ac.uk
>
>	I've built a parallel version for SGPC by using PVM. The configuration
>is made of a master program (where all genetic operations take place) and
>one server for every machine (where fitnesses are calculated). Individuals are
>sent from the master to the servers and servers answer with fitness.  The server
>is chosen at random. The good point with this scheme is that you can have 
>several masters at the same time using the same servers.
>That way you keep the servers pretty busy and you have the two kinds
>of parallelism at the same time (|| at individual level and || at run level).
>However I didn't get as much increase in performance as I expected because
>plenty of time is wasted in sending individuals from the master to the servers.
>Of course, the longer the fitness evaluation takes (compared to the sending
>time), the better results you obtain (compared to the
>serial version)
>
>	I've also got a bunch of different machines (different speeds).
>In this scheme the server has to wait for all fitnesses to be received 
>so the total system works at the speed of the slowest server. I've thought
>of making the selection of the server adaptive depending on the time the last
>evaluation took. This is too simple because evaluation time depends on
>the individual, the load in the machine and the machine speed. Still thinking
>on this. If someone has ideas or has tried something similar (or different)
>tell it please.
>
>						Ricardo Aler
>						University of Sunderland (UK)
>
>

A few minor comments based on some work I've done with distributed GA's:

(1) To effectively utilize multiple-CPU machines as what might be called "fitness
servers", you should have multiple evaluations occurring at once on them.  In fact,
if n is the number of CPU's, then there should be n simultaneous evaluations (at
least on SUNs, where the ||ism of multiple CPU's can only be realized by having
multiple processes executing).

(2) The more basic issue underlying the issue of different machine speeds is that
of trying to keep all of the fitness servers as busy as possible as much as possible.
If all machines are kept fully busy all the time, then the GA is executing at
maximal speed.
The only time when the fitness servers should not be fully busy is at what I have
called the "synchronization time" at the end of a generation.  This is the time
at the end of a generation when all the individuals of the current generation have
either finished evaluation or been farmed out to a fitness server for evaluation,
and the GA is waiting to collect results before moving on to the next generation.
With generation sizes that are large and machine speeds that are at least on the
same order of magnitude, this synchronization time is small.  [Note that I use the
term generation size rather than population size because I use steady-state GA's
even when using distributed GA's; with serial GA's, I always use a generation size of
1, but with || GA's, the generation size is greater than 1 but generally less than
the population size.]
A way that I have proposed for eliminating synchronization time completely is
to completely eliminate the concept of generations.  When a fitness server becomes
free, the GA generates a new individual from its current population of evaluated
individuals and farms it off to the fitness server.
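
The generation-free scheme just described can be sketched as follows. This
is a minimal illustration under stated assumptions, not the implementation
referred to above: worker threads from Python's concurrent.futures stand in
for the fitness servers, a bit-counting function stands in for an expensive
evaluation, and the names fitness, breed and run, the tournament selection,
and the replace-the-worst policy are all hypothetical choices.

```python
import random
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def fitness(bits):
    """Toy stand-in for an expensive evaluation: count the 1 bits."""
    return sum(bits)

def breed(population, rng):
    """Pick two evaluated parents by tournament, then apply one-point
    crossover and a single bit-flip mutation."""
    def pick():
        a, b = rng.sample(population, 2)
        return a if a[0] >= b[0] else b
    (_, mum), (_, dad) = pick(), pick()
    cut = rng.randrange(1, len(mum))
    child = mum[:cut] + dad[cut:]
    child[rng.randrange(len(child))] ^= 1
    return child

def run(n_servers=4, pop_size=20, n_bits=32, budget=400, seed=0):
    rng = random.Random(seed)
    # The population holds only evaluated individuals: (fitness, genome).
    population = []
    for _ in range(pop_size):
        genome = [rng.randint(0, 1) for _ in range(n_bits)]
        population.append((fitness(genome), genome))
    submitted = completed = 0
    pending = {}  # future -> genome currently being evaluated
    with ThreadPoolExecutor(max_workers=n_servers) as pool:
        while completed < budget:
            # Whenever a server is free, breed from the current population
            # and farm the child out at once; there is no generation barrier.
            while len(pending) < n_servers and submitted < budget:
                child = breed(population, rng)
                pending[pool.submit(fitness, child)] = child
                submitted += 1
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for fut in done:
                child = pending.pop(fut)
                score = fut.result()
                completed += 1
                worst = min(range(pop_size), key=lambda k: population[k][0])
                if score >= population[worst][0]:
                    population[worst] = (score, child)
    return population, completed

population, completed = run()
print(max(f for f, _ in population), completed)
```

Because results are folded into the population in whatever order the
servers finish, two runs with the same seed need not produce the same
population.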

Dave

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Wed Feb 23 04:09:46 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA12017
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Wed, 23 Feb 1994 04:09:44 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id BAA15204 for <Genetic-Programming@list.stanford.edu>; Wed, 23 Feb 1994 01:31:47 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from mailhost.lanl.gov by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA17938; Wed, 23 Feb 1994 01:30:43 -0800
Received: from t13.lanl.gov by mailhost.lanl.gov (8.6.4/1.2)
	id CAA25671; Wed, 23 Feb 1994 02:29:36 -0700
Received: from pullet.lanl.gov.t13net by t13.lanl.gov (4.1/SMI-4.1)
	id AA18401; Wed, 23 Feb 94 02:30:41 MST
Date: Wed, 23 Feb 94 02:30:41 MST
From: cgl@t13.lanl.gov (Chris Langton)
Message-Id: <9402230930.AA18401@t13.lanl.gov>
To: genetic-programming@cs.stanford.edu
Subject: Re: GP on Star Trek.....
Status: RO

There is, of course, the theory that our non-coding DNA consists
of comments.......

e.g.:

/***************************************************************
 * The following hack allows fins to drag stomach on ground    *
 * when out of water - easier to do this than fix bug          *
 * that causes exit from water - see related hacks in air-sac  *
 * code that allows for getting oxygen when out of water       *
 ***************************************************************/

The revision control comments alone probably take up most of
the room...

Yours in livelier, *well documented*, computation...

Chris Langton

(with a tip of the hat to Rik Belew, from whom I heard the theory...)

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Wed Feb 23 01:58:09 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA09719
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Wed, 23 Feb 1994 01:58:03 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id XAA13974 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 23:25:02 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from HPP.Stanford.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA17610; Tue, 22 Feb 1994 23:23:59 -0800
Received: by HPP.Stanford.EDU (4.1/inc-1.0)
	id AA17014; Tue, 22 Feb 94 23:23:58 PST
Date: Tue, 22 Feb 1994 23:23:57 PST
From: James Rice <rice@camis.Stanford.EDU>
To: David Andre <phred@leland.Stanford.EDU>
Cc: genetic-programming@cs.stanford.edu
Subject: Re: GP on Star Trek..... 
In-Reply-To: Your message of Tue, 22 Feb 1994 22:05:18 -0800 (PST) 
Message-Id: <CMM.0.88.761988237.rice@hpp.Stanford.EDU>
Status: RO

This is a clear extension of GP image compression covered
in Jaws and the Movie.  Well, ... maybe not....


Rice - still won't trust the Romulans, even if we're related.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Wed Feb 23 00:48:54 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA28822
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Wed, 23 Feb 1994 00:48:52 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id WAA13616 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 22:06:23 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from elaine14.Stanford.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA16426; Tue, 22 Feb 1994 22:05:20 -0800
Received: from localhost (phred@localhost) by elaine14.Stanford.EDU (8.6.4/8.6.4) id WAA14358 for genetic-programming@cs.stanford.edu; Tue, 22 Feb 1994 22:05:19 -0800
From: David Andre <phred@leland.Stanford.EDU>
Message-Id: <199402230605.WAA14358@elaine14.Stanford.EDU>
Subject: GP on Star Trek.....
To: genetic-programming@cs.stanford.edu
Date: Tue, 22 Feb 1994 22:05:18 -0800 (PST)
X-Mailer: ELM [version 2.4 PL21]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 746       
Status: RO

Maybe this is old news, but I was watching 'Star Trek -- The next generation',
and saw an episode in which Picard and the amazing computers (never seem to 
have trouble with computer power there...) discovered a 'program' that 
was embedded in the genetic codes of people from many different planets which
they then 'decoded' and caused a holographic image to come about and explain
the existence of the code and what not.  The code was placed by an
ancient civilization in the primordial soup of many planets, and was 
designed to bring about life similar in form to their own.  (Which explains
why all life on Star Trek is humanoid.)  Anyway, I thought it was 
interesting, and wondered if the writers had read about GP or GA's.  

David Andre

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 22:19:57 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA15188
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 22:19:53 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id TAA13062 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 19:48:31 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from noc.msc.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA13831; Tue, 22 Feb 1994 19:47:06 -0800
Received: from uc.msc.edu by noc.msc.edu (5.65/MSC/v3.0.1(920324))
	id AA20950; Tue, 22 Feb 94 21:46:49 -0600
Received: from et.msc.edu by uc.msc.edu (5.65/MSC/v3.0z(901212))
	id AA07123; Tue, 22 Feb 94 21:46:49 -0600
Received: by et.msc.edu (4.1/SMI-4.1)
	id AA18292; Tue, 22 Feb 94 21:46:47 CST
Date: Tue, 22 Feb 94 21:46:47 CST
From: alk@et.msc.edu (Anthony L. Kimball)
Message-Id: <9402230346.AA18292@et.msc.edu>
To: aries@media.mit.edu
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: James Rice's message of Mon, 21 Feb 1994 23:50:11 PST <CMM.0.88.761903411.rice@hpp.Stanford.EDU>
Subject: *LISP 
Status: RO


*Lisp has strong merits, but as a quondam C* compiler hack, I imagine that
my CM GP environment of choice would be global-nodal C* on a CM-5.  (C* is
ANSI C with extensions for array syntax.) In this programming model, a
global C* code performs system-wide initialization, load balance,
accumulation of results, global analysis, and data-parallel IO to a
parallel storage device such as a Scalable Disk Array, while divergent
threads of control in the GP kernel are managed by nodal C* code called
from the global thread as the regime of computation demands.  Effective use
of the vector units in such a scheme requires that the primitives include
data-parallel operations, reductions, scans, element-wise operations, etc.;
however, the dimension of data-parallelism need not be terribly large to
make effective use of a single node, and particularly when using a machine
with superscalar SPARC nodes good throughput is still achievable in the
absence of vector operations (although the Apogee or even GNU compiler is
to be preferred over C* in such cases, as the C* compiler focuses its
optimization efforts quite narrowly on vector code).

If you *really* want to make effective use of your CM-2, I would suggest
contacting the research team of Guy Blelloch at CMU.  Their NESL compiler
seems a very interesting vehicle for GP work in this environment.  (I
believe a good introductory paper can be had from the 92 or 93 POPL
conference proceedings, published in SIGPLAN notices.  Any APL veteran
with the GP bug would be well advised to scout NESL, methinks.)  The
primitives of NESL are pregnant with novel approaches to data-parallel
interpretation.  Overall, I believe Rice is correct, given the state of the
art in SIMD compilation, and that the CM-2 is much better adapted to GA than
GP work.  I'm sure that you've already seen published GA work alluding to
techniques for using the particular features of the CM-2 architecture for
GAs... if not, let me know if you'd like me to hunt down a few references.

//alk

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 17:30:00 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA03643
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 17:11:07 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id OAA11403 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 14:13:57 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from relay1.UU.NET by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA28560; Tue, 22 Feb 1994 14:12:53 -0800
Received: from uucp6.UU.NET by relay1.UU.NET with SMTP 
	(5.61/UUNET-internet-primary) id AAweke02225; Tue, 22 Feb 94 17:12:41 -0500
Received: from drd.UUCP by uucp6.UU.NET with UUCP/RMAIL
        ; Tue, 22 Feb 1994 17:12:45 -0500
Received: from cielo.drd by drd.com (4.1/SMI-4.1)
	id AA02097; Tue, 22 Feb 94 15:42:53 CST
From: tdh@drd.com (Tom.Haynes)
Message-Id: <9402222142.AA02097@drd.com>
Subject: RE: Network parallelism
To: cs0ral@isis.sunderland.ac.uk (r.aler)
Date: Tue, 22 Feb 94 15:42:52 CST
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <9402221433.AA07526@isis.sunderland.ac.uk>; from "r.aler" at Feb 22, 94 2:33 pm
Reply-To: tom.haynes@drd.com
X-Mailer: ELM [version 2.3 PL11]
Status: RO

> 
> 
> > Well, I've just finished getting PVM working across all the machines in our
> > department and am about to ||ise a GP implementation. I'd be very interested
> 
> 	I've built a parallel version for SGPC by using PVM. The configuration
> is made of a master program (where all genetic operations take place) and
> one server for every machine (where fitnesses are calculated). Individuals are
> sent from the master to the servers and servers answer with fitness.  The server
> is chosen at random. The good point with this scheme is that you can have 
> several masters at the same time using the same servers.
> That way you keep the servers pretty busy and you have the two kinds
> of parallelism at the same time (|| at individual level and || at run level).
> However I didn't get as much increase in performance as I expected because
> plenty of time is wasted in sending individuals from the master to the servers.
> Of course, the longer the fitness evaluation takes (compared to the sending
> time), the better the results you obtain (compared to the serial version).
> 
> 	I've also got a bunch of different machines (different speeds).
> In this scheme the master has to wait for all fitnesses to be received
> so the total system works at the speed of the slowest server. I've thought
> of making the selection of the server adaptive depending on the time the last
> evaluation took. This is too simple because evaluation time depends on
> the individual, the load on the machine and the machine speed. Still thinking
> about this. If someone has ideas or has tried something similar (or
> different), please let me know.
> 
> 						Ricardo Aler
> 						University of Sunderland (UK)
> 

A different approach is to adapt a Hyperswap from a Hypercube.

The basic algorithm is for each machine to carry out the GP separately.
Every X generations, each machine selects two individuals based on
fitness.  Copies of these two individuals are then sent to another
machine.  This machine decides whether to keep these individuals based on
a steady state fitness evaluation.  The two which are least fit of the
new population are dropped.
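
The exchange step above can be sketched roughly as follows.  This is
only a minimal sketch in Python, not the HYPERGEN code: it assumes
higher fitness is better, stores the population as (fitness,
individual) pairs, and simply takes the two fittest as emigrants
(the text only says "based on fitness").  The names pick_emigrants
and accept_immigrants are hypothetical.

```python
import heapq

def pick_emigrants(population, k=2):
    # Copies of the k fittest (fitness, individual) pairs (assumed rule).
    return heapq.nlargest(k, population, key=lambda pair: pair[0])

def accept_immigrants(population, immigrants):
    # Merge the newcomers, then drop the least fit so that the
    # population size stays the same as before the exchange.
    size = len(population)
    merged = list(population) + list(immigrants)
    return heapq.nlargest(size, merged, key=lambda pair: pair[0])

island_a = [(0.9, "a1"), (0.2, "a2"), (0.5, "a3")]
island_b = [(0.1, "b1"), (0.4, "b2"), (0.3, "b3")]
island_b = accept_immigrants(island_b, pick_emigrants(island_a))
```

Only two individuals cross the network per exchange, which is where
the saving in message traffic comes from.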

(This would tend to minimize the message handling overhead that is
impacting your execution time.)

One way to maximize the parallelism is to try and select populations
from different parts of the search space for each machine.  Diversity is
the spice of life.

The Hyperswap algorithm determines which neighbor receives the
individuals each time.  It basically cycles over the n neighbors of
any node in an n-dimensional hypercube.
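
One way to picture the neighbor cycling, as a hedged sketch in Python:
number the 2^n nodes with n-bit integers, so that two nodes are
neighbors when their numbers differ in exactly one bit.  Flipping bit
(round mod n) is an assumed schedule for illustration; the exact order
used in the paper is not given here, and the function names are
hypothetical.

```python
def hypercube_neighbors(node, dims):
    # The dims neighbors of a node: flip each bit of its number in turn.
    return [node ^ (1 << d) for d in range(dims)]

def hyperswap_target(node, dims, exchange_round):
    # Assumed schedule: use dimension (round mod dims) on each exchange,
    # so every neighbor is visited once per dims exchanges.
    return node ^ (1 << (exchange_round % dims))

targets = [hyperswap_target(5, 3, r) for r in range(4)]
```

With this schedule the pairing is symmetric: if node A sends to node B
in a given round, then B sends to A in that same round.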

This approach is written up in Knight, Leslie, and Wainwright, Roger L.,
"HYPERGEN - A Distributed Genetic Algorithm on a Hypercube",
_Proceedings of the 1992 Scalable High Performance Computing
Conference, SHPCC '92_, Williamsburg, VA, April 26-29, 1992.

-- 
Tom Haynes or more commonly => tdh@drd.com  DRD Corporation  (918)743-3013

------------------------------------------------------------------------------

From Eric.Teller@GS61.SP.CS.CMU.EDU Tue Feb 22 13:19:28 1994
Received: from GS61.SP.CS.CMU.EDU by ccwf.cc.utexas.edu with SMTP id AA22224
  (5.65c/IDA-1.4.4 for <mccoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 13:19:26 -0600
Message-Id: <199402221919.AA22224@ccwf.cc.utexas.edu>
From: Eric Teller <astro@GS61.SP.CS.CMU.EDU>
Date: Tue, 22 Feb 94 14:18:49 EST
To: mccoy@ccwf.cc.utexas.edu
In-Reply-To: Jim McCoy's message of Mon, 21 Feb 1994 15:34:41 -0600 (CST) <199402212134.AA13009@tramp.cc.utexas.edu>
Subject: a question about the Texas ftp site.
Status: RO


Hi Jim,

Could you let me know when you move the paper I "put"
yesterday into the papers directory. (Curiosities.ps)

Thanks

: )

Astro.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 12:49:45 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA20725
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 12:49:44 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id KAA09454 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 10:03:59 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received:  by Sunburn.Stanford.EDU (5.67b/25-SUNBURN-eef) id AA15192; Tue, 22 Feb 1994 10:02:55 -0800
Date: Tue, 22 Feb 94 10:02:55 PST
From: John Koza <koza@CS.Stanford.EDU>
To: genetic-programming@CS.Stanford.EDU
Subject: Bauer Book
Message-Id: <CMM.0.90.4.761940175.koza@Sunburn.Stanford.EDU>
Status: RO

I just received a book entitled GENETIC ALGORITHMS AND INVESTMENT STRATEGIES
by Richard J. Bauer (Wiley ISBN 0-471-57679-4 at 800-225-5945). It's purely
GA, but looks interesting.
John Koza

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 12:13:36 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA19193
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 12:13:34 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA09169 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 09:17:04 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from netcom8.netcom.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA12417; Tue, 22 Feb 1994 09:16:00 -0800
Received: from localhost by netcom8.netcom.com (8.6.4/SMI-4.1/Netcom)
	id JAA23979; Tue, 22 Feb 1994 09:15:24 -0800
Date: Tue, 22 Feb 1994 09:15:24 -0800
From: order@netcom.com (Walter Alden Tackett)
Message-Id: <199402221715.JAA23979@netcom8.netcom.com>
To: cs0ral@isis.sunderland.ac.uk
Cc: genetic-programming@cs.stanford.edu, cs0ral@orac.sunderland.ac.uk
In-Reply-To: <9402221433.AA07526@isis.sunderland.ac.uk> (cs0ral@isis.sunderland.ac.uk)
Subject: RE: Network parallelism
Status: RO

Ricardo-
I would be very interested in getting a copy of your PVM version of
SGPC (I wrote SGPC).  Is it general, or locked in to one problem??

-walter

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 12:11:12 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA19097
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 12:11:10 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA09137 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 09:09:28 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from netcom8.netcom.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA12075; Tue, 22 Feb 1994 09:08:23 -0800
Received: from localhost by netcom8.netcom.com (8.6.4/SMI-4.1/Netcom)
	id JAA23551; Tue, 22 Feb 1994 09:09:06 -0800
Date: Tue, 22 Feb 1994 09:09:06 -0800
From: order@netcom.com (Walter Alden Tackett)
Message-Id: <199402221709.JAA23551@netcom8.netcom.com>
To: conor@ravenloft.ucc.ie
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <m0pYuhs-00004HC@ravenloft.ucc.ie> (conor@ravenloft.ucc.ie)
Subject: Re: Network parallelism.
Status: RO

After having spent a CPU year and a 40-page paper on the subject of 
distributed vs. panmictic populations, I have come to the conclusion
that a bunch of small runs with no mixing are better than a big run
whether that big run is panmictic or spatially distributed. E.g., 20
runs with a population of 500 beats one run of 10000 every time.  I think
the network parallelism is good, though...
-wt

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 10:29:37 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA13548
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 10:18:09 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id HAA08649 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 07:20:19 -0800
Received: from aries.SAIC.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA06301; Tue, 22 Feb 1994 07:19:15 -0800
Received: from deneb.saic.com.saic.com by aries.saic.com (4.1/SMI-4.1)
	id AA01319; Tue, 22 Feb 94 08:18:47 MST
Date: Tue, 22 Feb 94 08:18:47 MST
From: pothiers@aries.saic.com (Steve Pothier)
Message-Id: <9402221518.AA01319@aries.saic.com>
Received: by deneb.saic.com.saic.com (4.1/SMI-4.1)
	id AA03086; Tue, 22 Feb 94 08:18:49 MST
To: aries@media.mit.edu
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <9402220413.AA14060@media.mit.edu> (aries@media.mit.edu)
Subject: Re: *LISP
   Date: Mon, 21 Feb 94 23:13:05 -0500
   From: "Michael P. Johnson" <aries@media.mit.edu>
Status: RO


   Hello.  Perhaps someone can help me.  I am looking for a GP library for
   *LISP, on the Connection Machine, or a reason why I would be able to
   write one.  

   We need to speed up our runs, which take order 2 to 3 days on an R4000
   Indigo in Lisp.  This is too long for the preliminary experiments we are
   doing.  E.g., we'd need 20 runs (over a MONTH) to get a reasonable std.
   dev. or convergence percent.  This is with a popsize of only 500 as
   well.  The evaluation of fitness for an individual is clearly the
   bottleneck.  It is fairly expensive.  If we could parallelize the
   evaluation of a form's fitness, clearly we'd win.

   Can the CM do this?  I know very little about them.  My guess is that
   the CM5 (MIMD) could but the CM2 (SIMD) probably couldn't.  Is this
   intuition right?  (That would be too bad since I could get megatime on a
   CM2, not as much on a CM5).

   If anyone has code, experience, or suggestions on this, please let me
   know.  I am eager to start fiddling again and I think my serial version
   is optimized as much as possible.  Besides, GP is MADE to be parallel!
   It seems blasphemous to run it on a serial machine (unless, of course,
   you have 100 such machines available...)

I've used the CM1 and CM2 in the past.  I was frequently amazed at how
many algorithms that didn't seem to "fit" the SIMD architecture could
be made to work on them.

In your case the CM5 would obviously fit the problem better but there
is probably something you can do on the CM2.  I assume that part of
the reason the run time for individuals is high is that you have some
computationally expensive "primitives".  If so you could imagine that
many (most?) of your long running individuals contained a common set
of expensive primitives.  How about running the primitives in
parallel?

-Pothier-

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 11:28:57 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA15638
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 11:02:51 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id IAA08836 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 08:07:15 -0800
Received: from sun2.nsfnet-relay.ac.uk by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA07387; Tue, 22 Feb 1994 08:06:07 -0800
Via: uk.ac.sunderland.consgate; Tue, 22 Feb 1994 15:00:05 +0000
Via: isis.sunderland.ac.uk (isis.sund.ac.uk); Tue, 22 Feb 1994 14:34:00 +0000
Received: by isis.sunderland.ac.uk (4.1/SMI-4.1) id AA07526;
          Tue, 22 Feb 94 14:33:39 GMT
From: cs0ral@isis.sunderland.ac.uk (r.aler)
Message-Id: <9402221433.AA07526@isis.sunderland.ac.uk>
Subject: RE: Network parallelism
To: genetic-programming@cs.stanford.edu
Date: Tue, 22 Feb 1994 14:33:39 +0000 (GMT)
Cc: cs0ral@orac.sunderland.ac.uk
X-Mailer: ELM [version 2.4 PL22]
Content-Type: text
Content-Length: 1662
Status: RO

> Well, I've just finished getting PVM working across all the machines in our
> department and am about to ||ise a GP implementation. I'd be very interested

	I've built a parallel version for SGPC by using PVM. The configuration
is made of a master program (where all genetic operations take place) and
one server for every machine (where fitnesses are calculated). Individuals are
sent from the master to the servers and servers answer with fitness.  The server
is chosen at random. The good point with this scheme is that you can have 
several masters at the same time using the same servers.
That way you keep the servers pretty busy and you have the two kinds
of parallelism at the same time (|| at individual level and || at run level).
However I didn't get as much increase in performance as I expected because
plenty of time is wasted in sending individuals from the master to the servers.
Of course, the longer the fitness evaluation takes (compared to the sending
time), the better the results you obtain (compared to the serial version).
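
The master/server arrangement can be sketched in a few lines of Python.
This is only an illustration, not the PVM/SGPC code: a thread pool
stands in for the servers, evaluate_fitness is a placeholder, and the
pool gives the next individual to whichever worker is idle instead of
choosing a server at random.  All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_fitness(individual):
    # Placeholder for an expensive fitness evaluation (assumption).
    return sum(individual)

def farm_out(population, n_servers=4):
    # The "master" hands individuals to a pool of "servers" and collects
    # the fitnesses in population order.  An idle worker takes the next
    # individual, so a fast worker ends up evaluating more of them.
    with ThreadPoolExecutor(max_workers=n_servers) as servers:
        return list(servers.map(evaluate_fitness, population))

fitnesses = farm_out([[1, 2], [3], [0, 0, 5]])
```

As in the PVM version, the gain depends on the evaluation costing much
more than shipping the individual to the worker.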

	I've also got a bunch of different machines (different speeds).
In this scheme the master has to wait for all fitnesses to be received
so the total system works at the speed of the slowest server. I've thought
of making the selection of the server adaptive depending on the time the last
evaluation took. This is too simple because evaluation time depends on
the individual, the load on the machine and the machine speed. Still thinking
about this. If someone has ideas or has tried something similar (or
different), please let me know.

						Ricardo Aler
						University of Sunderland (UK)

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 08:45:42 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA09733
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 08:45:40 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id FAA08302 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 05:43:22 -0800
Received: from vnet.ibm.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA04750; Tue, 22 Feb 1994 05:42:16 -0800
Received: from OWEGO by vnet.IBM.COM (IBM VM SMTP V2R2) with BSMTP id 3982;
   Tue, 22 Feb 94 08:39:16 EST
Received: by OWEGO (XAGENTA 3.0) id 0564; Tue, 22 Feb 1994 08:41:35 -0500
Received: by barbarian.endicott.ibm.com (AIX 3.2/UCB 5.64/4.03)
          id AA16042; Tue, 22 Feb 1994 08:39:03 -0500
Date: Tue, 22 Feb 1994 08:39:03 -0500
From: <PJA@owego.vnet.ibm.com> (Peter J. Angeline)
Message-Id: <9402221339.AA16042@barbarian.endicott.ibm.com>
To: phred@leland.Stanford.EDU
Cc: aries@media.mit.edu, genetic-programming@cs.stanford.edu
In-Reply-To: David Andre's message of Mon, 21 Feb 1994 20:44:34 -0800 (PST) <199402220444.UAA27435@elaine14.Stanford.EDU>
Subject: Sorting....
Reply-To: <PJA@owego.vnet.ibm.com>
Status: RO


David Andre wrote:

> qsort yields results on a much lower generation-equiv #, but takes
> longer to run (as qsort sucks with nearly pre-sorted data).
> 'Insertion' sort doesn't do as well in terms of gen-equiv #, but
> is much faster to run.
>
> I've tested 5 different problems with different seeds, all of which
> show the above results. Out of about 20 runs, only 1 has been
> flipped the other way, where the bug falls on the side of 'insertion'.
>
> Why should flipping the individuals with the same fitness around in the
> sorted list cause such a beneficial effect?  I would have guessed it to
> be random....

There might be a number of reasons. The most likely explanation I can think of
given your description, and assuming you have no bugs, is that your method of
selecting an individual from the population is somehow biased by the ordering
of the population. For instance, the chance of selecting an individual for
reproduction might decrease too quickly as it gets further from the head of the
population.  When using a sort method that doesn't reorder equivalent members,
you would generally reproduce from the same small subset of the population.
What it would give you is a weird form of premature convergence. But when using
a sort that DOES reorder equivalent individuals, you would have more variety
and thus not be subjected to the problems of reproducing with only a small
subset. The mixing would help increase the distribution of generated children.
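
A toy illustration of the effect, sketched in Python under the
assumption that parents are simply taken from the head of the sorted
list (the actual selection method is not known to me).  Python's
sorted() is stable, so individuals with equal fitness keep their
relative order; head_parents is a hypothetical name.

```python
def head_parents(population, n_parents):
    # Rank by fitness (higher is better) with a stable sort, then take
    # the parents from the head of the list (assumed selection rule).
    ranked = sorted(population, key=lambda ind: ind[1], reverse=True)
    return ranked[:n_parents], ranked

population = [("a", 1.0), ("b", 1.0), ("c", 1.0), ("d", 0.5)]
parents, ranked = head_parents(population, 2)
parents_again, _ = head_parents(ranked, 2)
```

Here "c" is just as fit as "a" and "b" but is never chosen, however
many times the list is re-sorted.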

I, for one, am not a supporter of steady state methods. I think they have some
serious problems (See DeJong and Sarma in FOGA2). But I would guess that a
steady state method that selects more uniformly from the population for
reproduction purposes and only uses fitness for the determination of who in the
population to replace, would be a better overall scheme than using fitness to
determine who to reproduce. Since you're overloading the population with
relatives, you should keep the mixing of the subtrees high.  That's what I
think qsort might be helping you do.
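
The kind of steady state scheme I have in mind could look roughly like
this.  It is a minimal sketch in Python, assuming higher fitness is
better and that the child replaces the worst member only when it is
fitter; steady_state_step, fitness and breed are hypothetical names.

```python
import random

def steady_state_step(population, fitness, breed, rng=random):
    # Parents are drawn uniformly; fitness plays no part in this choice.
    parent_a, parent_b = rng.sample(population, 2)
    child = breed(parent_a, parent_b)
    # Fitness is used only to decide who gets replaced.
    worst = min(range(len(population)),
                key=lambda i: fitness(population[i]))
    if fitness(child) > fitness(population[worst]):
        population[worst] = child
    return population

pop = steady_state_step([1, 2, 3, 4], fitness=lambda x: x,
                        breed=lambda a, b: a + b,
                        rng=random.Random(0))
```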

That's my best guess.

-pete

+----------------------------------------------------------------------------+
| Peter J. Angeline, Ph. D.     |                                            |
| Loral Federal Systems Group   |                                            |
| Rt 17C                        |       I have nothing to say,               |
| Mail Drop 0210                |               and I am saying it.          |
| Owego, NY 13827               |                                            |
| (607)751-4109                 |                               - John Cage  |
| pja@owego.vnet.ibm.com        |                                            |
+----------------------------------------------------------------------------+

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Wed Feb 23 02:44:13 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA10755
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Wed, 23 Feb 1994 02:44:11 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id CAA07476 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 02:08:20 -0800
Received: from mail.netcom.com (netcom4.netcom.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA27724; Tue, 22 Feb 1994 02:07:15 -0800
Received: from localhost by mail.netcom.com (8.6.4/SMI-4.1/Netcom)
	id CAA20085; Tue, 22 Feb 1994 02:08:04 -0800
From: szabo@netcom.com (Nick Szabo)
Message-Id: <199402221008.CAA20085@mail.netcom.com>
Subject: Re: Sorting....
To: phred@leland.Stanford.EDU (David Andre)
Date: Tue, 22 Feb 1994 02:08:03 -0800 (PST)
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <199402220444.UAA27435@elaine14.Stanford.EDU> from "David Andre" at Feb 21, 94 08:44:34 pm
X-Mailer: ELM [version 2.4 PL23]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 455       
Status: RO


David Andre: 
> Why should flipping the individuals with the same fitness around 
> in the sorted list cause such a beneficial effect? 

Among unique individuals with the same fitness rank,
diversity can be lowered if some are consistently favored
over others.  Flipping the individuals probably reduces
this bias.  Even more improvement might be obtained by
purposefully reordering same-rank individuals in a random way.
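
One simple way to do that reordering, sketched in Python: shuffle the
population first and then apply a stable sort, so that the order among
equal-fitness individuals is random on every call.  This assumes higher
fitness is better; rank_with_random_ties is a hypothetical name.

```python
import random

def rank_with_random_ties(population, fitness, rng=random):
    # Shuffle a copy, then sort stably by fitness: ties keep the
    # shuffled (random) order instead of a fixed one.
    shuffled = list(population)
    rng.shuffle(shuffled)
    return sorted(shuffled, key=fitness, reverse=True)

population = [("a", 1.0), ("b", 1.0), ("c", 1.0), ("d", 0.5)]
ranked = rank_with_random_ties(population, fitness=lambda ind: ind[1],
                               rng=random.Random(1994))
```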

Nick Szabo				szabo@netcom.com

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 05:36:55 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA05687
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 05:36:42 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id CAA07640 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 02:52:47 -0800
Received: from csvax1.ucc.ie by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA29007; Tue, 22 Feb 1994 02:51:40 -0800
Received: from ravenloft.ucc.ie by csvax1.ucc.ie (MX V3.3 VAX) with SMTP; Tue,
          22 Feb 1994 10:51:16 BST
Received: by ravenloft.ucc.ie (Smail3.1.28.1 #6) id m0pYuhs-00004HC; Tue, 22
          Feb 94 10:51 GMT
Message-Id: <m0pYuhs-00004HC@ravenloft.ucc.ie>
From: conor@ravenloft.ucc.ie (Conor Ryan)
Subject: Network parallelism.
To: genetic-programming@cs.stanford.edu
Date: Tue, 22 Feb 1994 10:51:11 +0100 (GMT)
X-Mailer: ELM [version 2.4 PL21]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 1049
Status: RO

JR wrote 
> Now to parallelism:
> 
> CM2s are kind of old and they're SIMD machines.  This means
> [.. snip snip ..]
> To do this you might want to use
> LINDA (easiest, but a little slow) or PVM (fast, free, but
> harder to get going).  A number of GPers have used network
> ||ism successfully.  Simon Handley and Andy Singleton leap
> to mind.  The tricky part is finding cooperative people who
> are prepared to let you use their off peak cycles.  The
> good part is that there are zillions of off-peak cycles out
> there for the asking if you can only find them.
> 
Well, I've just finished getting PVM working across all the machines in our
department and am about to ||ise a GP implementation. I'd be very interested
in hearing how Simon or Andy went about it, did you use spatial populations?
Multiple pops? Or were individuals just farmed out to machines? The problem
with my setup is that the machines vary from some spanking new Alphas to a
couple of 486 machines, so there is quite a variety in performance.
Suggestions anyone?

Conor.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 09:35:19 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA11668
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 09:35:14 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id GAA08515 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 06:41:11 -0800
Received: from neural.hampshire.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA05703; Tue, 22 Feb 1994 06:40:02 -0800
Received: by neural.hampshire.edu (Smail3.1.28.1 #1)
	id m0pYyIc-00050EC; Tue, 22 Feb 94 09:41 GMT
Message-Id: <m0pYyIc-00050EC@neural.hampshire.edu>
Subject: Re: *LISP
To: aries@media.mit.EDU (Michael P. Johnson)
Date: Tue, 22 Feb 1994 09:41:21 +0000 (GMT)
From: "Lee Spector" <lee@neural.hampshire.edu>
Cc: genetic-programming@cs.stanford.EDU
In-Reply-To: <9402220413.AA14060@media.mit.edu> from "Michael P. Johnson" at Feb 21, 94 11:13:05 pm
Reply-To: lspector@hamp.hampshire.edu
X-Mailer: ELM [version 2.4 PL22]
Content-Type: text
Content-Length: 2043      
Status: RO

Michael P. Johnson writes:
> Hello.  Perhaps someone can help me.  I am looking for a GP library for
> *LISP, on the Connection Machine, or a reason why I would be able to
> write one.  [lots deleted]
> 
> If anyone has code, experience, or suggestions on this, please let me
> know. [lots deleted]

I agree with most that Rice had to say in reply, but I've been thinking
of another use of the CM for GP that may be of interest. A couple of years
back I helped to design a knowledge representation system for the CM called
PARKA -- it is still under development/improvement at U. Maryland. PARKA
behaves for the most part like a garden-variety "frame" system -- the
difference is that many things go much faster on the CM version. (Times
tend to depend only on the depth of the knowledge base, not the size.)
PARKA was designed for the CM2, but is being moved to the CM5. The idea
is to use || *not* to evaluate n fitness cases in ||, but to speed up
every fitness evaluation. The utility of this approach depends on the
nature of the GP application -- for most of the published stuff that I've
seen it would be useless, since PARKA wouldn't speed up the evaluation of
forms composed of +, *, %, IFLTE, etc. For anyone contemplating the use
of higher-level function sets, particularly AI-ish function sets, this
approach might make a lot of sense. Using standard serial tools one might
be facing several minutes of runtime for the evaluation of a single individual;
PARKA could reduce this to milliseconds. The PARKA calls are straightforward
and the user doesn't need to know any *Lisp -- a serial version allows
experimentation without the CM.

I realize that this is not the use of CM ||ism that Michael had in
mind... If anyone else is interested in this special-purpose CM/GP connection
let me know -- I'd like to have another excuse to work on it.

Lee Spector
Assistant Professor of Computer Science
School of Communications and Cognitive Science
Hampshire College
Amherst, Massachusetts 01002-5001
e-mail: lspector@hamp.hampshire.edu

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 02:49:37 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA03342
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 02:49:35 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id XAA06954 for <Genetic-Programming@list.stanford.edu>; Mon, 21 Feb 1994 23:51:18 -0800
Received: from HPP.Stanford.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA26657; Mon, 21 Feb 1994 23:50:15 -0800
Received: by HPP.Stanford.EDU (4.1/inc-1.0)
	id AA27752; Mon, 21 Feb 94 23:50:12 PST
Date: Mon, 21 Feb 1994 23:50:11 PST
From: James Rice <rice@camis.Stanford.EDU>
To: "Michael P. Johnson" <aries@media.mit.edu>
Cc: genetic-programming@cs.stanford.edu
Subject: Re: *LISP 
In-Reply-To: Your message of Mon, 21 Feb 94 23:13:05 -0500 
Message-Id: <CMM.0.88.761903411.rice@hpp.Stanford.EDU>
Status: RO

I suppose you'll get the normal "you should be doing it in
C, you'd get it to be (insert hyperbole here) times
faster", but I'll chip in a bit, even though I've never
used a CM.

You'll probably find that you can squeeze about an order of
magnitude in speedup out of your existing fitness function
by some judicious optimisation.  This really is possible.
A student of JK's who was working in the office next to
mine last year got >20X by exactly the process I outline
below.  This all rather depends on the Lisp implementation
you're using, but if you're using Lucid then this is what
you do:

(proclaim '(optimize (speed 3) (safety 0) (compilation-speed 0)))

and then recompile your whole system.

Then say 

(start-backtrace-logging "some file name")

There's an optional arg to start-backtrace-logging that
will increase the amount of logging it will do.  I don't
remember whether it's an optarg or a keyword arg.  Check
apropos to be sure.

Now run your code in its normal manner for a while.

(stop-backtrace-logging)

(summarize-backtrace-logging "some file name")

There are a bunch of keyword args to the above function
that you'll find with describe.  The important one
increases the default depth of the printout it gives of
your time profile.

Now you know where your time's going and you can focus on
optimising those functions.

(compiler-options :show-optimizations t)

and then recompile the functions that are eating your time.
The compiler will then tell you what it failed to optimise
because of a lack of type declarations.  Put in the
relevant declarations, recompile and retest, iterating
until you've cranked it up.  Obviously, the metering may
reveal that you've got some serious algorithmic bogosities
anyway.

It is claimed (and I am tempted to agree with the claim)
that a good Lisp compiler will generate code every bit as
efficient as C AS LONG AS YOU PROVIDE IT WITH THE CORRECT
(POSSIBLY IMPLEMENTATION-SPECIFIC) TYPE DECLARATIONS.  The
fact that you do this in Lisp after you've got your code
running and only spend your time on time-critical
functions might be held to be a benefit.  The down-side is
that you're paying a 20Mb up-front cost in application size
just to get in the game.

You'd have to look in your docs if you aren't using Lucid.
If you're using KCL (or a derivative) ignore everything
I've said, you're hosed.

If fast-eval ends up at the top of the list then you can
recode it so that it cases on the function arg inside your
particular version of fast-eval and then calls the
functions directly inline rather than funcalling them.
This means that if (for example) you have +, -, *, % in
your function set then (with suitable type declarations)
they will get open-coded into your (now problem-specific)
fast-eval and will boil down to just a couple of machine
instructions.
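
The shape of such a problem-specific evaluator, sketched in Python
rather than Lisp purely for illustration.  Python will not open-code
anything, so this shows only the structure (casing on the function
inside the evaluator instead of calling through a function object);
the speed comes from doing the same in Lisp with type declarations.
The tuple representation of programs and the rule that % returns 1 on
division by zero are assumptions.

```python
def fast_eval(tree, env):
    # Terminals: a string names a variable, anything else is a constant.
    if isinstance(tree, str):
        return env[tree]
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree
    a = fast_eval(left, env)
    b = fast_eval(right, env)
    # Case on the operator here instead of calling a stored function.
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    if op == "%":
        # Protected division (assumed convention): 1 on divide-by-zero.
        return 1.0 if b == 0 else a / b
    raise ValueError("unknown function: %r" % (op,))

value = fast_eval(("+", "x", ("*", 2, 3)), {"x": 1})
```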

In any decent Lisp implementation, you can always write
your inner loop in C if you really don't believe that you
can get the performance of C at no extra cost.


Now to parallelism:

CM2s are kind of old and they're SIMD machines.  This means
that you can get huge speedup out of them if you've got a
VERY crystalline problem that requires that you do basically
the same thing to a whole bunch of different data items.
If your application (say) involves zipping over a huge
array of pixels you're in with a chance of getting speedup,
but if you're thinking of running each individual in the
population on each PE then think again.

Parallel programming is not for the faint of heart, so you
might want to think twice about getting into this whole
area.  The CM5 has a completely different architecture that
would allow you to run different members of the population
in parallel.  However, as I understand it, to get peak
performance out of a CM5 you need to use the vector units
properly.  Each processing element has its own kind of
mini-SIMD vector processor.  The vector units account for
the bulk of the performance of the machine, the SPARC
processors that reside at each node are really only there
as a) communications processors and b) processors to keep
the vector units fed with stuff to do.  If your application
can be split up into chunks that involve a lot of
matrix/array/vector hacking (though possibly in relatively
small chunks), then this could be the machine for you.  I
believe that we have readers from TMC who will be able to
disabuse you of any noise in this message.

The bottom line is that parallelism isn't necessarily easy,
though GP screams out for it.  It may be that your easiest
option would be to use workstation clusters.  This is just
fine over an ethernet, especially if your fitness
computation is expensive.  To do this you might want to use
LINDA (easiest, but a little slow) or PVM (fast, free, but
harder to get going).  A number of GPers have used network
||ism successfully.  Simon Handley and Andy Singleton leap
to mind.  The tricky part is finding cooperative people who
are prepared to let you use their off peak cycles.  The
good part is that there are zillions of off-peak cycles out
there for the asking if you can only find them.

Sorry about the hurried note.  This is just a brief pause
in proof reading Jaws-2.

See you all when the dust settles (ha!),


Rice.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 02:10:14 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA02549
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 02:05:15 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id XAA06850 for <Genetic-Programming@list.stanford.edu>; Mon, 21 Feb 1994 23:21:05 -0800
Received: from netcom8.netcom.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA26304; Mon, 21 Feb 1994 23:20:02 -0800
Received: from localhost by netcom8.netcom.com (8.6.4/SMI-4.1/Netcom)
	id XAA05098; Mon, 21 Feb 1994 23:20:44 -0800
Date: Mon, 21 Feb 1994 23:20:44 -0800
From: order@netcom.com (Walter Alden Tackett)
Message-Id: <199402220720.XAA05098@netcom8.netcom.com>
To: aries@media.mit.edu
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <9402220413.AA14060@media.mit.edu> (aries@media.mit.edu)
Subject: Re: *LISP
Status: RO

Michael-
You can get a 20-50 factor of improvement by using a GP system written in
c.  There are a variety of them, including SGPC written by yours truly
which is pretty much bug-free.  There is another unix-type version out
there for c++, but word is it's buggy.  there's also andy singleton's
gpquick, but that may be specialized to the PC - i couldn't say for
sure.  i think that most of these can be found at ftp.cc.utexas.edu.
C is certainly your cheapest and most cost-effective alternative.

-walter tackett
 

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 21 23:39:01 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA10411
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 21 Feb 1994 23:38:59 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id UAA06210 for <Genetic-Programming@list.stanford.edu>; Mon, 21 Feb 1994 20:45:42 -0800
Received: from elaine14.Stanford.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA24464; Mon, 21 Feb 1994 20:44:38 -0800
Received: from localhost (phred@localhost) by elaine14.Stanford.EDU (8.6.4/8.6.4) id UAA27435; Mon, 21 Feb 1994 20:44:35 -0800
From: David Andre <phred@leland.Stanford.EDU>
Message-Id: <199402220444.UAA27435@elaine14.Stanford.EDU>
Subject: Re: Sorting....
To: aries@media.mit.edu (Michael P. Johnson)
Date: Mon, 21 Feb 1994 20:44:34 -0800 (PST)
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <9402220352.AA12884@media.mit.edu> from "Michael P. Johnson" at Feb 21, 94 10:52:32 pm
X-Mailer: ELM [version 2.4 PL21]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 904       
Status: RO

Well, I am running more runs, but it appears to be quite stable in 
the following:

qsort yields results on a much lower generation-equiv #, but takes 
longer to run (as qsort sucks with nearly pre-sorted data).  
'Insertion' sort doesn't do as well in terms of gen-equiv #, but 
is much faster to run.  

I've tested 5 different problems with different seeds, all of which
show the above results. Out of about 20 runs, only 1 has been 
flipped the other way, where the bug falls on the side of 'insertion'.


Why should flipping the individuals with the same fitness around in the 
sorted list cause such a beneficial effect?  I would have guessed it to
be random....

If 20 to 1 seems like it could still be 'chance' let me know.  Of course,
I probably have a bug.  :->.  Just was curious if there was any 'wisdom' 
that had already decided qsort should be used/not used, etc.  Thanks....


David Andre

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 21 23:14:15 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA09625
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 21 Feb 1994 23:14:13 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id UAA06096 for <Genetic-Programming@list.stanford.edu>; Mon, 21 Feb 1994 20:14:10 -0800
Received: from media.mit.edu (media-lab.media.mit.edu) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA24109; Mon, 21 Feb 1994 20:13:06 -0800
Received: by media.mit.edu (5.57/DA1.0.4.amt)
	id AA14060; Mon, 21 Feb 94 23:13:05 -0500
Date: Mon, 21 Feb 94 23:13:05 -0500
From: "Michael P. Johnson" <aries@media.mit.edu>
Message-Id: <9402220413.AA14060@media.mit.edu>
To: genetic-programming@cs.stanford.edu
Subject: *LISP
Status: RO


Hello.  Perhaps someone can help me.  I am looking for a GP library for
*LISP, on the Connection Machine, or a reason why I would be able to
write one.  

We need to speed up our runs, which take order 2 to 3 days on an R4000
Indigo in Lisp.  This is too long for the preliminary experiments we are
doing.  E.g., we'd need 20 runs (over a MONTH) to get a reasonable std.
dev. or convergence percent.  This is with a popsize of only 500 as
well.  The evaluation of fitness for an individual is clearly the
bottleneck.  It is fairly expensive.  If we could parallelize the
evaluation of a form's fitness, clearly we'd win.

Can the CM do this?  I know very little about them.  My guess is that
the CM5 (MIMD) could but the CM2 (SIMD) probably couldn't.  Is this
intuition right?  (That would be too bad since I could get megatime on a
CM2, not as much on a CM5).

If anyone has code, experience, or suggestions on this, please let me
know.  I am eager to start fiddling again and I think my serial version
is optimized as much as possible.  Besides, GP is MADE to be parallel!
It seems blasphemous to run it on a serial machine (unless, of course,
you have 100 such machines available...)

Thanks,

-Mike
aries@media.mit.edu

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 21 19:52:46 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA02692
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 21 Feb 1994 19:52:44 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id RAA05341 for <Genetic-Programming@list.stanford.edu>; Mon, 21 Feb 1994 17:06:24 -0800
Received: from elaine18.Stanford.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA20791; Mon, 21 Feb 1994 17:05:20 -0800
Received: from localhost (phred@localhost) by elaine18.Stanford.EDU (8.6.4/8.6.4) id RAA07461 for genetic-programming@cs.stanford.edu; Mon, 21 Feb 1994 17:05:15 -0800
From: David Andre <phred@leland.Stanford.EDU>
Message-Id: <199402220105.RAA07461@elaine18.Stanford.EDU>
Subject: Sorting....
To: genetic-programming@cs.stanford.edu
Date: Mon, 21 Feb 1994 17:05:14 -0800 (PST)
X-Mailer: ELM [version 2.4 PL21]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 882       
Status: RO

Howdy, all.

I'm experiencing a strange bug as I am testing a gp-core in C that I have 
written.  I am using Steady State GP, and have experimented first
with evaluating the fitness of the two new individuals and then 
sorting them into the population using a qsort (wasteful on time).  Then,
I wrote a simple routine to insert the individuals into
their correct locations.  To my surprise, the two algorithms perform
differently!  Apparently, qsort is 'flipping' the order of several of the
individuals with identical fitness, whereas the insertion algorithm 
leaves  individuals with the same score in the same place.  For example, 

in attempting to solve the regression problem for 

12x^3 + 5x^2 + x,

using the qsort method I find a solution in 3419 'births', whereas
using the insertion method, I find no solutions even after 10000 individuals.
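
One possible way tie order could matter, sketched in Python and not taken
from the gp-core described here: if the population is kept sorted best-first
and the last entry is the one that gets replaced, then where a newcomer lands
among equal-fitness individuals decides who survives.  The function name, the
(fitness, name) pairs, and the "drop the last entry" rule are all assumptions
of this sketch.

```python
import bisect


def steady_state_insert(population, newcomer, newcomer_first):
    """Insert newcomer into population (sorted best-first) and drop the last entry.

    population is a list of (fitness, name) pairs; lower fitness is better.
    newcomer_first decides where the newcomer lands among equal-fitness entries.
    Returns the pair that was dropped.
    """
    keys = [fitness for fitness, _ in population]
    if newcomer_first:
        position = bisect.bisect_left(keys, newcomer[0])   # ahead of its ties
    else:
        position = bisect.bisect_right(keys, newcomer[0])  # behind its ties
    population.insert(position, newcomer)
    return population.pop()


if __name__ == "__main__":
    for newcomer_first in (True, False):
        pop = [(1.0, "a"), (2.0, "b"), (2.0, "c")]
        dropped = steady_state_insert(pop, (2.0, "new"), newcomer_first)
        print(newcomer_first, "dropped", dropped, "kept", pop)
```

With ties broken in favour of the newcomer an old individual is displaced;
with ties broken the other way the newcomer itself is dropped at once, so
equal-fitness variants never enter the population.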

Any ideas?

David Andre

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 22 22:14:05 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA15008
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 22 Feb 1994 22:14:03 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id TAA12932 for <Genetic-Programming@list.stanford.edu>; Tue, 22 Feb 1994 19:31:50 -0800
Errors-To: mail-errors@list.Stanford.EDU
Received: from worldlink.worldlink.com (worldlink.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA13592; Tue, 22 Feb 1994 19:30:43 -0800
Received: by worldlink.worldlink.com (5.65b/4.0.071791-Worldlink)
	id AA12287; Tue, 22 Feb 94 22:29:15 -0500
Message-Id: <2970958767.0.p00396@psilink.com>
Date: Mon, 21 Feb 94 18:20:57 -0500
To: "GP list" <genetic-programming@cs.stanford.edu>
From: "Andrew Singleton" <p00396@psilink.com>
Organization: Creation Mechanics
Subject: New version of GPQUICK
X-Mailer: PSILink-DOS (3.4)
Status: RO

In atonement for past sins, I am offering an updated version of GPQUICK.

The files GPQUICK.ZIP (DOS/ANSI) and GPQUICK.TAR (UNIX) in the archive
at host <ftp.cc.utexas.edu>, directory </pub/genetic-programming/code>, are 
new and improved.

The new version of GPQUICK has the following improvements:

Unified UNIX/DOS/ANSI source
Several bug fixes
More extensible object structure
FitnessValue object for carrying problem specific fitness information
Easier access to GA parameters
Streamlined Problem files
Improved function constructor
Standard primitives, including a working IF and IFLTE.
Annealing
2 more mutations
Artificial Ant problem

****************************************************************

I apologize that the "IF" function I sent out with the last version did not 
work with FASTEVAL.  Portability should be better with this version.

The new version already has a known problem which was also in the last 
version:  It doesn't check for successful memory allocation when allocating 
a new population.  Users should add this check.  If you run it under DOS 
with a population bigger than 2000, it WILL blow up.  If you use a PC, I 
strongly recommend that you use Windows or some other memory enhancer.

The ant problem is a faithful rendition of the original Koza experiment 
which makes a useful benchmark, as per previous communications.

"Annealing" (so described by Dan Adler) is a genetic operation that replaces 
the parent if the child is better, rather than reproducing into a new slot.  
If applied at more than a 1:1 ratio to reproduction, it has the effect of 
giving you the convergence characteristics of a larger population.  For 
instance, if you do 6 replacements for every reproduction on a population of 
1000, you get behavior similar to a population of 6000.
	Annealing also allows you to apply mutation at a higher dosage, which 
can cure some of the problems inherent in crossover based GP.  For instance, 
if you use it with a mutation that jiggles constants, you get better hill 
climbing and optimization behavior.  If you use it with a mutation that 
"shrinks" subtrees, you can directly attack "defense against crossover" 
expression bloat.  I have provided both constant mutations and shrink 
mutations for this purpose.
	I also hypothesize that because annealing is so greedy, it reduces the 
incentive for conservative (converging) behavior that produces expression 
bloat.
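
The annealing operation described above can be sketched roughly as follows.
This is Python, not GPQUICK's C++ code, and the names anneal_step and jiggle,
the parallel scores list, and the "lower score is better" convention are
assumptions made for the illustration.

```python
import random


def anneal_step(population, scores, index, mutate, fitness):
    """Try to improve population[index] in place.

    The child replaces its own parent only if it scores better; otherwise
    the population is left untouched.  Lower scores are treated as better.
    Returns True if the replacement happened.
    """
    child = mutate(population[index])
    child_score = fitness(child)
    if child_score < scores[index]:
        population[index] = child
        scores[index] = child_score
        return True
    return False


def jiggle(constants, rng, scale=0.1):
    """Hypothetical constant-jiggling mutation: add small Gaussian noise."""
    return [c + rng.gauss(0.0, scale) for c in constants]


if __name__ == "__main__":
    rng = random.Random(0)
    target = [3.0, -1.0]
    fitness = lambda cs: sum((c - t) ** 2 for c, t in zip(cs, target))
    population = [[0.0, 0.0]]
    scores = [fitness(population[0])]
    for _ in range(1000):
        anneal_step(population, scores, 0, lambda cs: jiggle(cs, rng), fitness)
    print(population[0], scores[0])
```

Since a worse child is simply discarded, a mutation can be applied far more
often than it could be if every result took a slot in the population.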

The FitnessValue object, for carrying problem-specific fitness information, 
was added by popular demand after a discussion last December on the GP list.
You can add your own data and "IsBetter" method.  This is part of a quest to 
define a standardized Problem class, which I encourage others to contribute 
to.
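
As a rough illustration of the idea only: the class below is a Python sketch,
not GPQUICK's actual C++ interface.  The fields error and size and the
tie-breaking rule are invented here to show how problem-specific data and an
"is better" comparison can travel together.

```python
from dataclasses import dataclass


@dataclass
class FitnessValue:
    error: float      # primary objective; lower is better (assumed)
    size: int = 0     # problem-specific extra: expression size in nodes

    def is_better(self, other):
        """True if self should win a comparison against other."""
        if self.error != other.error:
            return self.error < other.error
        return self.size < other.size   # break ties toward smaller programs
```

Selection code then only ever calls is_better, so a new problem can change
what "better" means without touching the GP engine.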

****************************************************************

I have noticed some competitive comments implying that "my implementation is 
better than yours."  I don't think that this is helpful.  We are talking 
about Freeware, and it can only grow by constructive collaboration.  I 
released GPQUICK as an educational exercise and to satisfy my publisher's 
request for demo code, not to prove a point.  I imagine that other 
implementers have a similar story, and I thank them for their time and 
ideas.

Having looked at GPQUICK, SGPC, GEPPETTO, and GPCPLUS, I must say that they 
all seem similar.  They all implement a plain vanilla S-Expression GP, and 
differ only in the details, not the result.  I suspect that Nil's effort is 
in the same category.  If you want to be competitive, channel your energies 
into advancing the state of the art.  Do something original.  Here are some 
examples and suggestions:

Innovative architecture:  Go beyond S-Expressions.  Try a stack 
implementation, as suggested by Tim Perkis, or a machine language 
implementation like the one contributed by Peter Nordin.  What about vector 
data types and other embedded data?

Innovative learning mechanisms:  Did you know that you can do delta rule 
backpropagation on GP expressions by applying the chain rule?  What about 
new types of mutation and crossover?  Sid Maxwell suggests "co-routines", 
which suggest other types of population management.  I have provided 
"annealing" operations in GPQUICK.

Initialization and parameterization:  What about methods for building a 
library of GP individuals?  How can they be reused?  What about enrichment 
of gen 0, as suggested by John Perry, or methods for selecting primitives 
from gen 0 results?  Do you provide unique and interesting primitives?  
Walter Tackett gave us demes and multi-objective selection in SGPC.  Can we 
explore new parameters?
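
To make the delta-rule remark under "Innovative learning mechanisms" concrete,
here is a small Python sketch under stated assumptions: expressions are
tuples built from "x", indexed constants, "+" and "*", and only the constants
are tuned.  The representation and the function names are invented for this
illustration and do not come from any of the packages mentioned.

```python
def value_and_grad(expr, x, consts):
    """Return (value, gradient with respect to each constant) for a tree.

    Node forms: ("x",), ("const", i), ("+", a, b), ("*", a, b).
    """
    op = expr[0]
    n = len(consts)
    if op == "x":
        return x, [0.0] * n
    if op == "const":
        grad = [0.0] * n
        grad[expr[1]] = 1.0
        return consts[expr[1]], grad
    va, ga = value_and_grad(expr[1], x, consts)
    vb, gb = value_and_grad(expr[2], x, consts)
    if op == "+":
        return va + vb, [p + q for p, q in zip(ga, gb)]
    if op == "*":
        # Product rule: d(a*b) = a*db + b*da, applied per constant.
        return va * vb, [va * q + vb * p for p, q in zip(ga, gb)]
    raise ValueError("unknown operator: %r" % (op,))


def delta_rule_pass(expr, consts, samples, rate):
    """One pass of per-sample updates: c_i -= rate * (output - target) * d(output)/dc_i."""
    consts = list(consts)
    for x, target in samples:
        out, grad = value_and_grad(expr, x, consts)
        err = out - target
        consts = [c - rate * err * g for c, g in zip(consts, grad)]
    return consts


if __name__ == "__main__":
    # c0 * x + c1, fitted to samples of 2x + 1.
    expr = ("+", ("*", ("const", 0), ("x",)), ("const", 1))
    samples = [(-1.0, -1.0), (0.0, 1.0), (1.0, 3.0)]
    consts = [0.0, 0.0]
    for _ in range(200):
        consts = delta_rule_pass(expr, consts, samples, 0.05)
    print(consts)
```

The gradient is propagated up the tree node by node, so the same recursion
extends to any primitive whose partial derivatives are known.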

Methods for collaboration:  What about a simple mechanism for distributed GP 
on the internet?  What about defining new and ever more interesting GP 
problems?  Koza has been great at generating "toy problems" for us to play 
with.  I am thinking about issuing a database for a simple speech 
recognition problem.  There must be other interesting tasks and datasets 
ready for release.

Building hierarchy: Moving to more complicated problems is the biggest 
obstacle we face.  John Koza has given us ADF's, and Adam Fraser has worked 
on putting them into GPCPLUS.  I hope to provide a mechanism that I call 
"chains".  Look at new mechanisms and try to improve them.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Mon Feb 21 10:29:23 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA07123
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Mon, 21 Feb 1994 10:01:51 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id HAA01768 for <Genetic-Programming@list.stanford.edu>; Mon, 21 Feb 1994 07:00:57 -0800
Received: from sun2.nsfnet-relay.ac.uk by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA07867; Mon, 21 Feb 1994 06:59:52 -0800
Via: uk.ac.bristol.compsci; Mon, 21 Feb 1994 14:58:20 +0000
Received: from danno by kukini.compsci.bristol.ac.uk id aa07455;
          21 Feb 94 14:59 GMT
To: genetic-programming@cs.stanford.edu
Date: Mon, 21 Feb 94 14:58:55 GMT
From: schenk@cs.bris.ac.uk
Sender: schenk@cs.bris.ac.uk
Message-Id: <9402211458.aa07139@uk.ac.bristol.compsci.danno>
Status: RO

Hi !
( those of you who don't program in Scheme, can stop reading here )

Currently I am trying to implement a GP-system in Scheme
(using TI's PC-Scheme)

As I am new to Scheme (new to any LISP-dialect) I have a few questions 
and hope you might be able to answer them :

1. I am using PC-Scheme 3.03. (the only version I could get). Is there
a newer/faster/smaller... version of it or (even better) some FTP-able Scheme 
that runs on a 386-33 4MB (and limited HD-space)

2. As I am new to the language I am tempted to use as many (fancy)
constructs as possible. One of them is call/cc which I used in the
following code : ( first example without call/cc : the tag -1 indicates
that I have found the node I was looking for, so just return )

(define (getSubtree tree cnt)
  (cond ((=? 0 cnt) (cons -1 tree))      ; cnt == 0 -> found the node
        ((atom? tree) (cons 1 tree))     ; atom? -> its a terminal
        (else
          (do ((args (cdr tree) (cdr args))
               (addem-up 1)
               (res (getSubtree (cadr tree) (-1+ cnt))
                    (getSubtree (cadr args) (- cnt addem-up))))
              ((or (null? (cdr args)) (= -1 (car res)))
               (if (= -1 (car res))
                   res
                   (cons (+ (car res) addem-up) (cdr res))))
              (set! addem-up (+ addem-up (car res)))))))

In this version instead of passing the -1 up I simply exit by
using call/cc.

(define (getSubtree1 subtree level)
  (call/cc
    (lambda (future)
      (define gs-loop
        (lambda (tree cnt)
          (cond ((=? 0 cnt) (future tree))     ; return this tree
                ((atom? tree) (cons 1 tree))   ; cons 1 ! : count terminal
                (else
                  (do ((args (cdr tree) (cdr args))
                       (addem-up 1)
                       (res (gs-loop (cadr tree) (-1+ cnt))
                            (gs-loop (cadr args) (- cnt addem-up))))
                      ((null? (cdr args))     ; exit cond
                       (cons (+ (car res) addem-up) (cdr res)))
                      (set! addem-up (+ addem-up (car res))))))))  ;body 
      (gs-loop subtree level))))

But, it isn't faster at all, on the contrary : the garbage-collection
is called more often. So, is using call/cc only useful for
error-handling ?

3. Here is my main problem (at the moment) :
For the crossover-operator and the mutation-operator I need a
procedure which returns a pointer/reference to a subtree so that
I can change that subtree (in Koza's code that is done by 
lots of mysterious multiple-value-bind-i-dont-understand-statements)

In the code above I tried replacing this line :

          (cond ((=? 0 cnt) (future tree))    

by

  (cond ((=? 0 cnt) (future (set! tree ...to-some-extra-argument)))

(or set-car! if it is a list)

I hoped this would destructively modify the original tree
(which of course didn't work, which is why I am sending this message)
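
A note on why the set! version has no effect: set! only rebinds the local
variable tree, while the parent list still points at the old subtree.  The
change has to be made in the parent's slot (the set-car! route).  The sketch
below shows that shape in Python with nested lists, as an illustration rather
than a PC-Scheme answer; replace_subtree and the node numbering (preorder,
root = 0, every argument counts as one node) are choices made for this example.

```python
def replace_subtree(tree, index, new_subtree):
    """Replace the index-th node of a nested-list tree, mutating it in place.

    A tree is either a terminal or a list [function, arg1, arg2, ...].
    The root has no parent slot, so for index 0 the new subtree is
    returned and the caller must use the return value.
    """
    if index == 0:
        return new_subtree
    counter = [0]  # boxed so the nested function can update it

    def walk(node):
        for i in range(1, len(node)):
            counter[0] += 1
            if counter[0] == index:
                node[i] = new_subtree  # write into the parent's slot
                return True
            if isinstance(node[i], list) and walk(node[i]):
                return True
        return False

    if not isinstance(tree, list) or not walk(tree):
        raise IndexError("node index out of range")
    return tree


if __name__ == "__main__":
    tree = ["+", ["*", "x", "y"], "z"]
    replace_subtree(tree, 3, ["-", "a", "b"])
    print(tree)
```

Crossover then amounts to picking a node in each parent and writing a copy of
one subtree into the other parent's slot.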

I'd be very grateful if you could help me on this one, maybe simply
suggest some useful references (and of course, some constructive
criticism on the above code is always welcome, I am sure it could
be done more elegantly)

Thanks a lot

Veit

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sat Feb 19 18:19:33 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA11477
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sat, 19 Feb 1994 18:19:24 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id PAA22778 for <Genetic-Programming@list.stanford.edu>; Sat, 19 Feb 1994 15:05:16 -0800
Received: from MIT.EDU (ATHENA-AS-WELL.MIT.EDU) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA02411; Sat, 19 Feb 1994 15:04:13 -0800
Received: from CADLAB6.MIT.EDU by MIT.EDU with SMTP
	id AA09189; Sat, 19 Feb 94 17:59:14 EST
Received: by cadlab6.MIT.EDU (5.61/4.7) id AA05696; Sat, 19 Feb 94 17:59:05 -0500
Message-Id: <9402192259.AA05696@cadlab6.MIT.EDU>
To: GA-List@AIC.NRL.NAVY.MIL, alife@cognet.ucla.edu,
        genetic-programming@cs.stanford.edu, eaga-dist@MIT.EDU,
        ep-list@magenta.me.fau.edu
Cc: jakiela@MIT.EDU
Subject: ECs in Industrial Organization Design
Date: Sat, 19 Feb 94 17:59:02 EST
From: "Kazuhiro M. Saito" <kazu@MIT.EDU>
Status: RO

Hello,

I posted a question on ECs in micro-economics, and just would like to
thank all the people who responded to me.  I have another question from my advisor:

_______________ Beginning of the broadcast ___________________________

Helpful Colleague:

I am trying to locate some literature on the applications of
evolutionary computation procedures (i.e. genetic algorithms, genetic
programming, Alife, perhaps connectionist models) to problems in the
design, simulation, and analysis of industrial organizations and
industrial production management systems.

How can evolutionary computation be used to analyze and design WORK
CELLS, WORK GROUPS, PRODUCT DATA MANAGEMENT SYSTEMS, MANUFACTURING
ORGANIZATIONS, ENTIRE FIRMS, WHOLE INDUSTRIES, etc.?

Note that this is in contrast to work that (i) more strictly deals with
economics, such as artificial stock markets; and (ii) more "lower level"
well-defined problems such as process planning and shop flow scheduling.

Any suggestions would be valuable.  Thank you very much for your help.

Mark Jakiela
MIT - Department of Mechanical Engineering

_______________ End of the broadcast ___________________________

Any info appreciated.

Thanks in advance,

Kazuhiro M. Saito
Graduate Student Research Assistant
MIT CAD Laboratory
kazu@mit.edu

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 18 22:45:28 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA00849
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 18 Feb 1994 22:45:26 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id UAA18457 for <Genetic-Programming@list.stanford.edu>; Fri, 18 Feb 1994 20:07:52 -0800
Received:  by Sunburn.Stanford.EDU (5.67b/25-SUNBURN-eef) id AA15537; Fri, 18 Feb 1994 20:06:49 -0800
Date: Fri, 18 Feb 94 20:06:48 PST
From: John Koza <koza@CS.Stanford.EDU>
To: genetic-Programming@CS.Stanford.EDU
Subject: [Jamal <DFSKSM%UTMKL.BITNET@forsythe.Stanford.EDU>: Re: Genetic
        Programming]
Message-Id: <CMM.0.90.4.761630808.koza@Sunburn.Stanford.EDU>
Status: RO

Return-Path: <DFSKSM@UTMKL.BITNET>
Received: from forsythe.Stanford.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA06546; Thu, 17 Feb 1994 23:22:27 -0800
Message-Id: <199402180722.AA06546@Sunburn.Stanford.EDU>
Received: by Forsythe.Stanford.EDU; Thu, 17 Feb 94 23:22:20 PST
Received: from UTMKL (DFSKSM) by UTMKL (Mailer R2.08 PTF008) with BSMTP id
 6122; Fri, 18 Feb 94 15:18:59 MAL
Date:         Fri, 18 Feb 94 15:12:21 MAL
From: Jamal <DFSKSM%UTMKL.BITNET@forsythe.Stanford.EDU>
Organization: Universiti Teknologi Malaysia, Kuala Lumpur
Subject:      Re: Genetic Programming
To: John Koza <koza@CS.Stanford.EDU>
In-Reply-To:  Your message of Wed, 19 Jan 94 21:50:45 PST
X-Acknowledge-To: <DFSKSM@UTMKL>

I am sorry for not responding earlier.

I would appreciate it if you could mention other places/persons that may be
available for me to learn gp (from scratch) for about 1 month. Asian Dev Bank
(my sponsor) is willing to pay for any fees incurred (plus my expenses going
and staying there). Maybe you can help me post this to a gp bulletin board
(if there is any).
Thank you.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 18 09:30:37 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA29656
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 18 Feb 1994 09:04:01 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id DAA13863 for <Genetic-Programming@list.stanford.edu>; Fri, 18 Feb 1994 03:39:06 -0800
Received: from mail.netcom.com (netcom6.netcom.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA12354; Fri, 18 Feb 1994 03:38:02 -0800
Received: from localhost by mail.netcom.com (8.6.4/SMI-4.1/Netcom)
	id DAA08007; Fri, 18 Feb 1994 03:38:50 -0800
From: szabo@netcom.com (Nick Szabo)
Message-Id: <199402181138.DAA08007@mail.netcom.com>
Subject: Measuring Complexity
To: genetic-programming@cs.stanford.edu
Date: Fri, 18 Feb 1994 03:38:49 -0800 (PST)
X-Mailer: ELM [version 2.4 PL23]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 5601      
Status: RO


Greetings GPer's,

I recently posted a version of this essay to sci.nanotech.
It's quite a bit more accessible than my recent post on GP & 
Induction theory, and covers many of the most important
concepts therein, albeit in terms that are less formal
and not specifically applied to GP.

---------------------------------------------------------

Is complexity valuable or costly?  Can we measure the complexity 
of an object?  

Counting the number of unique bits needed to describe an object
starts down the right path, but we need to go further.
A hot gas "contains", or requires to be described, a large 
number of unique bits, just as does the DNA of a California 
condor.  The concept of _logical depth_, which measures the 
amount of computation needed to produce the bit pattern (to 
discriminate it from other possible bit patterns), comes 
closer to measuring value.  The hot gas is logically shallow
because the process that creates it doesn't discriminate 
between the possible configurations of "bits", and it's trivial 
to create another such pattern in that class.  The California 
condor DNA is logically deep because evolution has, over several 
million years, discriminated it from its nearest non-endangered 
relative and a vast number of failed bird configurations.
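
A crude way to see the "number of bits needed to describe" part of this is to
use a general-purpose compressor as a stand-in for description length.  The
Python sketch below does that with zlib; the two byte strings are toy data
invented for the example.  It only bounds information content from above and
says nothing about logical depth, which concerns the computation needed to
produce the pattern rather than its size.

```python
import random
import zlib


def compressed_size(data):
    """Length in bytes of the zlib-compressed form of data.

    This is only an upper bound on how short a description can be.
    """
    return len(zlib.compress(data, 9))


rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(4096))  # "hot gas": no structure
regular = b"ACGT" * 1024                                 # same length, highly repetitive

if __name__ == "__main__":
    print("noisy  :", compressed_size(noisy))
    print("regular:", compressed_size(regular))
```

The noisy string needs about as many bytes as it has, the repetitive one very
few, yet neither number tells us how much work went into producing the data.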

Uniqueness can now be factored back in.  A cow by itself is as
logically deep as a condor (perhaps a little deeper if we consider 
intelligent breeding superior to natural selection), but
one cow among a billion contains far fewer unique deep bits 
because its DNA patterns are copied across the billion other 
cows with much higher probability than the condor's across the 
handful of other condors.  

Logically deep objects tend to be valuable because
they tend to be reusable: a part evolved, designed,
or computed for one use is more likely to be useful for
something else.  But there are plenty of exceptions.
The last California condor costs much more than the 1 billionth 
cow to replace, but if I'm hungry enough I might well feed
the cow before I feed the condor.

Thus logical depth in some cases correlates to value, but what 
it measures objectively is _cost_.  To call something valuable 
just because it is expensive is to fall into a trap similar to Marx's 
labor theory of value.

In the case of tools, all other things being equal they are
more valuable if they are simpler, if they contain _fewer_ bits.
Simpler tools are less expensive to design, make, and/or use.
A knife requires more information in our brains to use
as effectively as a gun, therefore it's less valuable for the user
(all other things being equal).  A valuable tool is usually
both as functional and as simple as possible.  Computation
(evolution, learning, design effort, etc.) is needed both to create 
a structure deeply adapted to its environment or desired use, and 
to find the simplest form for that functionality.  This happens
both in the process of designing a tool and using it.  In
both cases functionality vs. simplicity is a fundamental tradeoff.
Objects on which a great deal of evolution/design/computational
effort have been spent to obtain functionality and simplicity
are logically deep.

The Macintosh user interface shows that it often pays to move complexity 
out of the mind of the user and into the mind of the designer -- up to
a point.  Better still is to drive complexity into the "mind"
of the software itself, for example moving the complexity
of program language translation and optimization out of the 
mind of the programmer and into the compiler.  A main goal
of evolutionary design software is to reduce the complexity
that needs to be specified by the human engineer while greatly
increasing the amount of functionality (and concomitant complexity)
that can be designed into the product.

A simple scientific theory is also more valuable than a complex
one that explains the same data -- both because it is more
likely to be true, and because it is easier to understand
and communicate.  

A formal proof of Occam's Razor and the formula 1 - c1^(|p|-|x|+c2) 
that gives the probability of the regularity/predictive power of a model p
to explain data x, can be found in Li & Vitanyi, _An Introduction to 
Kolmogorov Complexity_, Springer-Verlag 1993, along with computationally
formal measures of logical depth, information content, and much
else.  Recommended for those who like mathematical challenge, and looking
up references in the library when Li & Vitanyi condense entire papers 
into a single sentence or formula!

Dani Eder observed in a recent post to sci.nanotech that scientific 
data from new instruments and frontiers tends to be more valuable 
than old repetitive scientific data.   Old data have already been
gleaned for their information content.  To create new, unique
theories we need new, unique information: more bits of 
precision, different wavelengths, different phenomena, etc.

Formalizations of information content, design cost, and 
scientific induction relate intimately to our interest in 
evolutionary design.  We can measure cost, and in some
cases value, by determining the logical depth of an object.
In evolutionary design we might use Kolmogorov complexity theory 
to trade off the simplicity, error, and computational costs of
our simulations, measure the effects of fitness and selection 
functions, choose design primitives, and to generalize and predict 
the consequences of these choices.  Much promise
lies in uniting the new formal theories of general
induction with evolutionary search techniques.

Nick Szabo				szabo@netcom.com

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Feb 17 16:52:35 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA15191
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 17 Feb 1994 16:52:31 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id NAA10399 for <Genetic-Programming@list.stanford.edu>; Thu, 17 Feb 1994 13:50:43 -0800
Received: from WLV.IIPO.GTEGSC.COM by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA20831; Thu, 17 Feb 1994 13:49:39 -0800
Received: from MX.IIPO.GTEGSC.COM by WLV.IIPO.GTEGSC.COM (5.67/1.35)
	id AA05301; Thu, 17 Feb 94 13:49:38 -0800
Received: by MX.IIPO.GTEGSC.COM with Microsoft Mail
	id <2D63E620@MX.IIPO.GTEGSC.COM>; Thu, 17 Feb 94 13:48:16 PST
From: "Burman, J A (Jerry)" <BurmanJ%HOST2@WLV.IIPO.GTEGSC.COM>
To: Genetic Program <genetic-programming@cs.stanford.edu>
Subject: Recent Mach Learn Conf
Date: Thu, 17 Feb 94 13:50:00 PST
Message-Id: <2D63E620@MX.IIPO.GTEGSC.COM>
Encoding: 10 TEXT
X-Mailer: Microsoft Mail V3.0
Status: RO


>Hi all GPers:
>I recently submitted a paper which addresses some of the issues of
>practical interest from your message to the Machine Learning
>Conference:
>"Hierarchical Self-Organization in Genetic Programming"
>If you would like a copy, I'd be happy to forward you one.
>Justinian Rosca

What is your E-Mail address Justinian?

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Feb 17 16:52:13 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA15174
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 17 Feb 1994 16:52:10 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id NAA09976 for <Genetic-Programming@list.stanford.edu>; Thu, 17 Feb 1994 13:10:27 -0800
Received: from HPP.Stanford.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA18431; Thu, 17 Feb 1994 13:09:22 -0800
Received: from KSL-EXP-35 (KSL-EXP-35.Stanford.EDU) by HPP.Stanford.EDU (4.1/inc-1.0)
	id AA04111; Thu, 17 Feb 94 13:09:12 PST
Message-Id: <2970508148-5756100@KSL-EXP-35>
Sender: RICE@KSL-EXP-35.Stanford.EDU
Date: Thu, 17 Feb 94  13:09:08 PST
From: James Rice <Rice@HPP.Stanford.EDU>
To: szabo@netcom.com (Nick Szabo), genetic-programming@cs.stanford.edu
Subject: Re: GP & Induction
In-Reply-To: <199402170921.BAA03183@netcom9.netcom.com>
Status: RO

[... Nick asks a bunch of questions that are provably 
 isomorphic to asking the mean air speed of an unladen
 swallow.]


                            _
                          -   -
                        /      \
                      /         \
                    /            \
I don't know that! ^              \
                                   \
                                    \
                                     |
                                     |
                                     |
                                     |
                                    =:-O  Aaaaaarrrrggghhhhh


A hair-raising plunge into the gorge of doom.......


*** All Un/Subscribe messages should go to      ***
*** genetic-programming-REQUEST@cs.stanford.edu ***
***                    ^^^^^^^^                 ***

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Feb 17 15:47:32 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA09968
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 17 Feb 1994 14:52:42 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA09435 for <Genetic-Programming@list.stanford.edu>; Thu, 17 Feb 1994 11:32:07 -0800
Received: from HPP.Stanford.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA14163; Thu, 17 Feb 1994 11:31:03 -0800
Received: from KSL-EXP-35 (KSL-EXP-35.Stanford.EDU) by HPP.Stanford.EDU (4.1/inc-1.0)
	id AA02348; Thu, 17 Feb 94 11:30:49 PST
Resent-Message-Id: <2970502241-5401241@KSL-EXP-35>
Resent-Sender: RICE@KSL-EXP-35.Stanford.EDU
Resent-Date: Thu, 17 Feb 94  11:30:41 PST
Resent-From: James Rice <Rice@HPP.Stanford.EDU>
Resent-To: genetic-programming@cs.stanford.edu
Received: from list.Stanford.EDU by HPP.Stanford.EDU (4.1/inc-1.0)
	id AA01328; Thu, 17 Feb 94 10:48:07 PST
Received: from cayuga.cs.rochester.edu (cayuga.cs.rochester.edu [192.5.53.209]) by list.Stanford.EDU (8.6.4/8.6.4) with ESMTP id KAA09200 for <genetic-programming-owner@list.Stanford.EDU>; Thu, 17 Feb 1994 10:49:03 -0800
From: rosca@cs.rochester.edu
Received: from honeydew.cs.rochester.edu (honeydew.cs.rochester.edu [192.5.53.88]) by cayuga.cs.rochester.edu (8.6.4/E) with ESMTP id NAA00938; Thu, 17 Feb 1994 13:47:01 -0500
Received: from localhost (rosca@localhost) by honeydew.cs.rochester.edu (8.6.4/E) id NAA25692; Thu, 17 Feb 1994 13:46:53 -0500
Date: Thu, 17 Feb 1994 13:46:53 -0500
Message-Id: <199402171846.NAA25692@honeydew.cs.rochester.edu>
To: szabo@netcom.com
Subject: GP & Induction
Cc: genetic-programming-owner@list.Stanford.EDU
Status: RO


Hi Nick (and all GPers),

I recently submitted a paper which addresses some of the issues of
practical interest from your message to the Machine Learning
Conference:

"Hierarchical Self-Organization in Genetic Programming"

The paper describes a slightly different approach to automatically
defined functions, somewhere between ADFs (John Koza) and modules
(Pete Angeline), called adaptive representation GP in which
arbitrarily complex hierarchies of functions can emerge. I formalized
several measures of complexity of program trees for the "automatic
discovery of functions" approaches. Based on descriptional complexity,
I used the MDL principle to define a general formula that can be used
for fitness evaluation.

If you would like a copy, I'd be happy to forward you one.

Justinian Rosca

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Feb 17 11:00:54 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA00216
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 17 Feb 1994 10:30:16 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id HAA08276 for <Genetic-Programming@list.stanford.edu>; Thu, 17 Feb 1994 07:32:00 -0800
Received: from GS61.SP.CS.CMU.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA02571; Thu, 17 Feb 1994 07:30:54 -0800
Message-Id: <199402171530.AA02571@Sunburn.Stanford.EDU>
From: Eric Teller <astro@GS61.SP.CS.CMU.EDU>
Date: Thu, 17 Feb 94 10:30:03 EST
To: szabo@netcom.com
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: Nick Szabo's message of Thu, 17 Feb 1994 01:21:31 -0800 (PST) <199402170921.BAA03183@netcom9.netcom.com>
Subject: GP & Induction
Status: RO


Nick (and the rest of GP land)

I recently published a paper which touches briefly on some of these topics.
It's more speculative than rigorous, but it's still got a little meat.

"Genetic Programming, Indexed Memory, the Halting Problem, and Other 
Curiosities"
(7th Annual Florida Artificial Intelligence Research Symposium)

If you would like a copy, send me your physical address and I'd be happy
to forward you a copy.

: )

Astro Teller.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 15 18:32:38 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA05885
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 15 Feb 1994 18:32:36 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id PAA28273 for <Genetic-Programming@list.stanford.edu>; Tue, 15 Feb 1994 15:34:56 -0800
Received: from crl.crl.com (crl.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA21057; Tue, 15 Feb 1994 15:33:51 -0800
Received: by crl.crl.com id AA23690
  (5.65c/IDA-1.5 for genetic-programming@cs.stanford.edu); Tue, 15 Feb 1994 15:39:59 -0800
Date: Tue, 15 Feb 1994 15:39:59 -0800
From: Nils Rognerud <nils@crl.com>
Message-Id: <199402152339.AA23690@crl.crl.com>
To: genetic-programming@cs.stanford.edu
Status: RO

I  wish to add a small correction to my last "slam" with the
title GP-QUICK in my earlier E-mail:

1.   I  apologize to anyone who was offended by  my  earlier
comments.  (Those who know me have learned to accept  that  I
may  speak very directly and without social veneer, and learn
to take it with a smile. I hope you can too.)

2.   The  criticism was mainly caused by my own laziness  of
wanting  to use the GP technology without having to  do  any
work.

3.   The  criticism  was  not  aimed  specifically  at  Andy
Singleton or his GPQUICK source code.  It was just a general
reaction to finding that the code (not Andy's) was generally
buggy and/or would not compile under Microsoft Windows NT 32
or 16 bit compiler.

I  have agreed to share results of my own work with Andy and
hope others are interested in doing the same.

I  also  believe the degree of non-confront on a problem  is
directly  measurable  by the degree  of  complexity  in  the
solution.   In other words: the more complex the solution, the
less  likely it is that the author understands what is going
on.

Best,

Nils Rognerud
P.S.  Let's not be too serious, guys (smile).  We only live  a
short time, and there is no need to develop ulcers.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 15 17:09:40 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA02572
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 15 Feb 1994 17:09:38 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id OAA28034 for <Genetic-Programming@list.stanford.edu>; Tue, 15 Feb 1994 14:09:39 -0800
Received: from tramp.cc.utexas.edu by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA15958; Tue, 15 Feb 1994 14:08:34 -0800
Received: by tramp.cc.utexas.edu id AA01162
  (5.65c/IDA-1.4.4 for genetic-programming@cs.stanford.edu); Tue, 15 Feb 1994 16:08:21 -0600
From: Jim McCoy <mccoy>
Message-Id: <199402152208.AA01162@tramp.cc.utexas.edu>
Subject: Re: GPQUICK
To: nils@crl.com (Nils Rognerud)
Date: Tue, 15 Feb 1994 16:08:20 -0600 (CST)
Cc: genetic-programming@cs.stanford.edu
In-Reply-To: <199402151910.AA18747@crl.crl.com> from "Nils Rognerud" at Feb 15, 94 11:10:34 am
X-Mailer: ELM [version 2.4 PL21]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 870       
Status: RO

Begin Rant.

Nils Rognerud <nils@crl.com> writes:
> 
> I hate to be a little negative, but the GP source code in C/C++ that
> I have found on the net is just less than average quality. [I wrote
> my own and will sell them to you.]
>

And your answer is "Nyah. Nyah. I have something better and you can't have
it."?  Gee. Thanks.  My heart brims over with your kind words and
thoughtful gestures... 

People usually write code and put it on the net because it works for them
and often anything is better than nothing.  I do not make my living writing
software; in fact my degree is in law and politics, so the code I donate
to the net for various purposes is generally ugly, semi-buggy, and
sometimes unportable; BFD.

IMNSHO, if you didn't pay for it what have you got to complain about?

> I'll get down from my soapbox now.

You mean billboard.

End rant.

jim

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 15 13:54:00 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA23832
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 15 Feb 1994 13:53:56 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id LAA27583 for <Genetic-Programming@list.stanford.edu>; Tue, 15 Feb 1994 11:05:33 -0800
Received: from crl.crl.com (crl.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA04072; Tue, 15 Feb 1994 11:04:28 -0800
Received: by crl.crl.com id AA18747
  (5.65c/IDA-1.5 for genetic-programming@cs.stanford.edu); Tue, 15 Feb 1994 11:10:34 -0800
Date: Tue, 15 Feb 1994 11:10:34 -0800
From: Nils Rognerud <nils@crl.com>
Message-Id: <199402151910.AA18747@crl.crl.com>
To: genetic-programming@cs.stanford.edu, keithm@icd.ab.com
Subject: Re:  GPQUICK
Cc: keithm%iccgcc.DNET@odin.icd.ab.com, p00396@pslink.com
Status: RO

I hate to be a little negative, but the GP source code in C/C++ that
I have found on the net is just less than average quality.

I have discussed this with other GP'ers as well, and we even found some
code from a university in England that has plain bugs (dangling pointers,
to mention just one item that is totally nuts!).

I can understand people who wish to share their work, but I wish they
wouldn't write so much bad code and treat software engineering like
an after-thought or a side-line.  Software can be done in a really simple
and beautiful manner that other people will embrace.  I make a living
writing software and I know it can be done (smile).

I decided to write my own GP routines in C++ using a pure OOD/OOP approach.
It is clean, mean and fast!  (I know I'm not supposed to promote commercial
interests on the net, but please see me privately for my consulting rates).

I'll get down from my soapbox now.

Best,

Nils Rognerud

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 15 12:39:18 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA20305
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 15 Feb 1994 12:39:14 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id JAA27395 for <Genetic-Programming@list.stanford.edu>; Tue, 15 Feb 1994 09:41:08 -0800
Received: from cra.cra.canon.com ([146.184.10.8]) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA29092; Tue, 15 Feb 1994 09:39:51 -0800
Received: by cra.cra.canon.com (5.65/1.35)
	id AA13768; Tue, 15 Feb 94 09:49:32 -0800
Date: Tue, 15 Feb 94 09:49:32 -0800
From: lanre@cra.canon.com (Lanre Amos)
Message-Id: <9402151749.AA13768@cra.cra.canon.com>
To: genetic-programming@cs.stanford.edu, keithm@icd.ab.com
Subject: Re:  GPQUICK
Cc: keithm%iccgcc.DNET@odin.icd.ab.com, p00396@pslink.com
Status: RO


Mike, I got it to run under VC++/32 on NT without any trouble. I just
commented out values.h and defined my own value for MAXFLOAT.
You may be trying to do it under Windows 3.1, in which case I can't be
of much help.

-L.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Tue Feb 15 08:22:42 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA09430
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Tue, 15 Feb 1994 08:22:35 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id FAA27164 for <Genetic-Programming@list.stanford.edu>; Tue, 15 Feb 1994 05:25:25 -0800
Received: from odin.icd.ab.com by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA17591; Tue, 15 Feb 1994 05:24:18 -0800
Received: from gadwal.icd.ab.com (gadwal.icd.ab.com [130.151.132.71]) by odin.icd.ab.com (8.1C/5.6) with SMTP id IAA22436; Tue, 15 Feb 1994 08:24:15 -0500
Date: Tue, 15 Feb 1994 08:24:15 -0500
From: "Mike J. Keith" <keithm@icd.ab.com>
Message-Id: <199402151324.IAA22436@odin.icd.ab.com>
To: genetic-programming@cs.stanford.edu
Subject: GPQUICK
Cc: keithm%iccgcc.DNET@odin.icd.ab.com, p00396@pslink.com
Status: RO

Has anyone tried running GPQUICK under a microsoft environment ??

I built it under Visual C++ and got system fault errors. There are a
bunch of type conversion warnings, such as on lrandom(), which is supposed
to return a long but is internally cast to return a double ??

Also, there was reference to the alloc.h which I replaced with
malloc.h ??

Mike

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Feb 13 20:21:07 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA02629
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 13 Feb 1994 20:21:06 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id RAA23936 for <Genetic-Programming@list.stanford.edu>; Sun, 13 Feb 1994 17:34:33 -0800
Received: from worldlink.worldlink.com (worldlink.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA19131; Sun, 13 Feb 1994 17:33:27 -0800
Received: by worldlink.worldlink.com (5.65b/4.0.071791-Worldlink)
	id AA21276; Sun, 13 Feb 94 20:32:39 -0500
Message-Id: <2970271206.2.p00396@psilink.com>
In-Reply-To: <9402140012.AA09933@maui.cs.ucla.edu>
Date: Sun, 13 Feb 94 19:39:37 -0500
To: "Robert Collins" <rjc@CS.UCLA.EDU>
Cc: "GP list" <genetic-programming@cs.stanford.edu>
From: "Andrew Singleton" <p00396@psilink.com>
Organization: Creation Mechanics
Subject: Re: co-routines, Artificial ant
X-Mailer: PSILink-DOS (3.4)
Status: RO

... Efficiency stuff

>Several problems with this paragraph.  First, the 10 hours is wrong.  While
>some of our early runs may have taken 10 hours, I optimized the code quite
>a bit and got that down to 2-3 minutes per run to find an optimal solution.
>Second, Koza (and now all GPers) use a much easier problem statement, because
>you give the ants plenty of time.  In the original problem statement, we
>gave the ants 5 bits of memory and only 200 ticks to get the job done.
>This requires quite a bit of trail-specific behavior to be "built into"
>the ant's program.  Much harder than finding the simple algorithm needed
>for the Koza formulation of the problem.  Third, you compare efficiency
>of your GP solution to FSA and ANN GA solutions on different problems.

I was quoting Koza, not Collins or even Singleton.  I agree completely
that the formulation for 200 time ticks is MUCH harder than the
formulation for 600 ticks.  I didn't bring this up because the problems
are not comparable in a number of ways, including criteria for
"solution", which seems to be unique to Koza.  I was merely continuing
the (poorly analogous) comparison, without realizing that it was a
contentious  subject.  If you check Koza's A-life II paper, you will
find that I paraphrased it ACCURATELY, perhaps at the expense of an
accurate analysis (or even, apparently, accurate facts).  The ant
problem is used as a convenient standard benchmark against Koza's work
and problem formulation, so I felt obliged to reference his article
rather than your article in the same book.

I am quite pleased to hear that your version of the problem eventually
ran in 2-3 minutes (comparable to our average wait time, but for a much 
harder problem).  This demonstrates that there is considerable scope
for improvement in most algorithms.

I apologize for any offense.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Feb 13 19:03:44 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA00846
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 13 Feb 1994 19:03:42 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id QAA23894 for <Genetic-Programming@list.stanford.edu>; Sun, 13 Feb 1994 16:13:47 -0800
Received: from Maui.CS.UCLA.EDU by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA18118; Sun, 13 Feb 1994 16:12:42 -0800
Received: by maui.cs.ucla.edu
	(Sendmail 5.61d+YP/3.23) id AA09933;
	Sun, 13 Feb 94 16:12:37 -0800
Date: Sun, 13 Feb 94 16:12:37 -0800
From: rjc@CS.UCLA.EDU (Robert Collins)
Message-Id: <9402140012.AA09933@maui.cs.ucla.edu>
To: p00396@psilink.com, smaxwell@borland.com
Subject: Re: co-routines, Artificial ant
Cc: genetic-programming@cs.stanford.edu
Status: RO

>From genetic-programming-owner@list.Stanford.EDU Sun Feb 13 13:36:34 1994
>
>ASIDE: In A-Life II, Koza wrote that Jefferson and Collins et al. used
>10 hours of CM2 supercomputer time to find a solution for the Santa Fe
>trail using a bit string GA.  He noted that it would take about 3 hours
>of TI Explorer time to find the solution with 99% probability, a
>significant improvement in efficiency.  The experiment described above
>imitates the Koza experiment in most details, but requires about 10
>minutes on a 486 to get to 99% probability.  This does go to show that
>genetic programming efficiency is on the same exponential improvement
>curve as other computing technology.  If we could get this improvement
>in problem complexity, as well as efficiency, we would really be in
>business.

Several problems with this paragraph.  First, the 10 hours is wrong.  While
some of our early runs may have taken 10 hours, I optimized the code quite
a bit and got that down to 2-3 minutes per run to find an optimal solution.
Second, Koza (and now all GPers) use a much easier problem statement, because
you give the ants plenty of time.  In the original problem statement, we
gave the ants 5 bits of memory and only 200 ticks to get the job done.
This requires quite a bit of trail-specific behavior to be "built into"
the ant's program.  Much harder than finding the simple algorithm needed
for the Koza formulation of the problem.  Third, you compare efficiency
of your GP solution to FSA and ANN GA solutions on different problems.

Sorry, but the 7 year old CM is still 5 times faster than your PC, even
with you using an easier problem.

rob

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Feb 13 15:56:55 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA26174
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 13 Feb 1994 15:56:52 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id NAA23787 for <Genetic-Programming@list.stanford.edu>; Sun, 13 Feb 1994 13:19:47 -0800
Received: from worldlink.worldlink.com (worldlink.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA15596; Sun, 13 Feb 1994 13:18:42 -0800
Received: by worldlink.worldlink.com (5.65b/4.0.071791-Worldlink)
	id AA15636; Sun, 13 Feb 94 16:17:56 -0500
Message-Id: <2970255423.3.p00396@psilink.com>
In-Reply-To: <m0pSD8j-0004uqC@genghis.borland.com>
Date: Sun, 13 Feb 94 15:28:17 -0500
To: "Sid Maxwell" <smaxwell@borland.com>
Cc: "GP list" <genetic-programming@cs.stanford.edu>
From: "Andrew Singleton" <p00396@psilink.com>
Organization: Creation Mechanics
Subject: Re: co-routines, Artificial ant
X-Mailer: PSILink-DOS (3.4)
Status: RO

I checked the hypothesis that the artificial ant problem benefits from
cutting off evaluation of the dud individuals.  I ran the problem with
the standard GPQUICK steady state GA and a simple cutoff rule: if the
score (food found) at 100 ticks is less than half the best-so-far at 100
ticks, the evaluation terminates and returns the current score.
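
The cutoff rule just described can be sketched roughly as follows. This is a minimal illustration in Python, not the GPQUICK code: `step` is a hypothetical callable standing in for one tick of the ant simulation, the 600-tick default is an assumption, and the way the best-so-far value is carried between calls is only one possible bookkeeping choice.

```python
def evaluate_with_cutoff(step, best_at_cutoff, max_ticks=600, cutoff_tick=100):
    """Score one individual, abandoning it early if it looks like a dud.

    step(tick) is a hypothetical callable returning the food found on that tick.
    best_at_cutoff is the best score any earlier individual had at cutoff_tick.
    Returns (score, ticks_used, updated_best_at_cutoff).
    """
    score = 0
    for tick in range(1, max_ticks + 1):
        score += step(tick)
        if tick == cutoff_tick:
            if score < best_at_cutoff / 2:
                # Dud: stop here and report the score reached so far.
                return score, tick, best_at_cutoff
            # Survivor: it may raise the bar for later individuals.
            best_at_cutoff = max(best_at_cutoff, score)
    return score, max_ticks, best_at_cutoff
```

The caller would thread the returned best-so-far value into the next evaluation, so the bar rises as the run finds better individuals.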

This produced runs with only 54% as much eval time.  They are
nearly twice as fast because about half of the individuals get killed at the
cutoff.  It produced a slight degradation in success probability after
25,000 evals with a population of 2000, but still was a noticeable benefit
in wall clock time.  With a population size of 5000, for comparison
with your better runs, I used 51% as much eval time per individual (the
maximums are better), and succeeded 60% of the time in 25,000 
generates.  This would require me to generate 125,000 individuals for a 
99% probability of success, with an "individual equivalent" effort of 
about 63,750.  This compares well with the co-routine result of 53,508 
minimum effort for population 5000.

This confirms my hypothesis that co-routines derive their benefits from:
1) Steady state reproduction;
2) Fewer evaluations on bad individuals;
and that these effects can be applied separately or together.  
Together, they give about the same results as the co-routine algorithm.

This is a useful finding because we can apply these enhancements 
directly to a variety of genetic algorithms, as well as to co-routines.


If you want my artificial ant implementation for GPQUICK, let me know.

ASIDE: In A-Life II, Koza wrote that Jefferson and Collins et al. used
10 hours of CM2 supercomputer time to find a solution for the Santa Fe
trail using a bit string GA.  He noted that it would take about 3 hours
of TI Explorer time to find the solution with 99% probability, a
significant improvement in efficiency.  The experiment described above
imitates the Koza experiment in most details, but requires about 10
minutes on a 486 to get to 99% probability.  This does go to show that
genetic programming efficiency is on the same exponential improvement
curve as other computing technology.  If we could get this improvement
in problem complexity, as well as efficiency, we would really be in
business.

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Sun Feb 13 13:40:39 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA22832
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Sun, 13 Feb 1994 13:40:37 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id KAA23712 for <Genetic-Programming@list.stanford.edu>; Sun, 13 Feb 1994 10:50:57 -0800
Received: from worldlink.worldlink.com (worldlink.com) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA13475; Sun, 13 Feb 1994 10:49:44 -0800
Received: by worldlink.worldlink.com (5.65b/4.0.071791-Worldlink)
	id AA12237; Sun, 13 Feb 94 13:48:52 -0500
Message-Id: <2970245865.0.p00396@psilink.com>
In-Reply-To: <m0pSD8j-0004uqC@genghis.borland.com>
Date: Sun, 13 Feb 94 12:58:53 -0500
To: "Sid Maxwell" <smaxwell@borland.com>
Cc: "GP list" <genetic-programming@cs.stanford.edu>
From: "Andrew Singleton" <p00396@psilink.com>
Organization: Creation Mechanics
Subject: Re: co-routines / Art ant
X-Mailer: PSILink-DOS (3.4)
Status: RO

Your results on the artificial ant problem were fascinating, and I was
impressed that you got the effort down to 32,000 individual equivalents.

In this communication I propose an explanation for these results, and I 
also propose an alternative algorithm which simplifies implementation.

On analyzing your co-routine idea, I realized that it contained three 
possible sources of advantage when compared with the standard 
generational GA.  The three advantages are:

1) Steady state.  A small percentage are replaced each time.
2) Fewer evaluations on the duds.
3) Earlier reproduction of the good guys.

I hypothesized that all of the advantage comes from 1 and 2.  The 
advantages of 3 are counterbalanced by the possibility of a dead 
end.  Furthermore, I have found steady state algorithms to be better on 
many problems, and I have found that killing the duds early can save a 
lot of evaluations.

We can test these possibilities separately.  I have implemented the
artificial ant with GPQUICK, and I did 100 runs on the 89 piece Santa
Fe trail, as described in the Koza article for A-Life II (my copy of
the Koza GP tome is otherwise engaged).  I used the same function set,
crossover based GP, and a population of 2000.  The only significant
difference was that I used a steady state GA, and ran it for 25,000
individuals.
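
For readers unfamiliar with the term, a steady state GA in the generic sense might look like the sketch below. It is an assumption-laden illustration, not GPQUICK: tournament selection, the tournament size, and the `init`, `fitness` and `breed` callables are all hypothetical, and the toy problem uses bit lists rather than program trees.

```python
import random

def steady_state_ga(init, fitness, breed, pop_size, n_offspring,
                    tournament=4, rng=random):
    """Generic steady-state GA: each step replaces one individual."""
    pop = [init(rng) for _ in range(pop_size)]
    fit = [fitness(ind) for ind in pop]

    def pick(chooser):
        # Tournament over a few distinct random slots; chooser is max or min.
        slots = rng.sample(range(pop_size), tournament)
        return chooser(slots, key=lambda i: fit[i])

    for _ in range(n_offspring):
        child = breed(pop[pick(max)], pop[pick(max)], rng)
        loser = pick(min)  # a weak individual gives up its slot
        pop[loser], fit[loser] = child, fitness(child)

    best = max(range(pop_size), key=lambda i: fit[i])
    return pop[best], fit[best]

# Toy usage: maximise the number of 1 bits in an 8-bit list.
def init_bits(rng):
    return [rng.randint(0, 1) for _ in range(8)]

def breed_bits(a, b, rng):
    cut = rng.randrange(1, 8)          # one-point crossover
    child = a[:cut] + b[cut:]
    if rng.random() < 0.1:             # occasional single-bit mutation
        child[rng.randrange(8)] ^= 1
    return child

best, score = steady_state_ga(init_bits, sum, breed_bits,
                              pop_size=50, n_offspring=500,
                              rng=random.Random(1))
```

Unlike a generational GA, there is no point at which the whole population is thrown away; new individuals can be selected as parents as soon as they are scored.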

(as an aside, the number of ticks allowed was very important.  I
allowed 600 ticks as specified by Rice in an earlier communication.
More than half of the runs that failed scored 88, and simply ran out of
time before finding the last piece.  More ticks give significantly
higher success rates.)

These runs succeeded in finding all 89 pieces 57% of the time, 
requiring about 5.5 runs to 25,000, or 137,000 individuals, to solve 
with 99% probability.  I did not tune this for minimum effort, but I 
think this is in the ballpark.
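
The "5.5 runs" figure follows from treating runs as independent trials: if each run succeeds with probability p, then R runs all fail with probability (1 - p)^R, and R is chosen so that this is at most 1%. A small Python sketch of that arithmetic (the function names are made up for illustration, and the result is left fractional rather than rounded up to whole runs):

```python
import math

def runs_for_confidence(p_success, confidence=0.99):
    """Independent runs needed so at least one succeeds with the given confidence."""
    # Solve (1 - p)^R = 1 - confidence for R.
    return math.log(1.0 - confidence) / math.log(1.0 - p_success)

def individuals_for_confidence(p_success, individuals_per_run, confidence=0.99):
    """Total individuals processed across those runs."""
    return runs_for_confidence(p_success, confidence) * individuals_per_run

runs = runs_for_confidence(0.57)
total = individuals_for_confidence(0.57, 25_000)
print(f"{runs:.2f} runs, about {total:,.0f} individuals")
```

With p = 0.57 this gives roughly 5.46 runs and a little over 136,000 individuals, in line with the rounded figures quoted above. Rounding up to whole runs, as some effort calculations do, would give a larger number.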

1) Steady state gains?

Looking at the graph in the Koza article, I noticed that for population 
size 2000, the probability of success at generation 13 (26,000 
individuals) is about 45%, and that the probability of success reaches 
57% at generation 25 (50,000 individuals), so the steady state GA is 
about 1.5 to 2 times as efficient as the generational GA.

You found a minimum effort of 96,683 individual equivalent evaluations
with a population of 2000.  Given the differences in minimum effort
calculations, this is not very different from my result of 137,000 for
the basic steady state GA.  So I attribute much of the gain in
co-routines to steady state behavior.  The fact that the gain is much
better at smaller replacement rates (closer to steady state) is very
telling.

2) Fewer evaluations of the duds?

I have not tested the possibility of doing fewer ticks/fitness cases on
the duds, but in other problems I have found that this can reduce effort
by a factor of 2 to 4, depending on how ruthless you are about killing
the laggards.  In some problems, this is easy because the duds have
scores near zero.  However, as the problem advances there is a limit to
how ruthless you can be,  because you have to get a statistically
significant number of  ticks/cases before you can kill a marginal
individual.  If you can be really ruthless (better than a factor of 4
improvement), you are probably just doing too many fitness cases.

I propose an alternative to the co-routines algorithm which is perhaps
easier to implement, does not require synchronized evals, and will fit
into an existing steady state GA.

1) Run a steady state GA, selecting one individual for replacement each time.

2) Keep for each tick/fitness case either:
	an exponential moving average of the past scores at that point,
	or a list of the past <n> scores at that point.

3) With some probability on each tick, kill an individual if it is 
below average or low in the score ranking.  Give it a low score and 
move on to the next replacement.

Some problems are even easier, because you can set an early cutoff score and 
just abandon anyone who falls below it.
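
Steps 2 and 3 of the proposal above might be sketched as follows, using the exponential moving average variant. This is only an illustration under stated assumptions: the surrounding steady state loop (step 1) is omitted, `step` is a hypothetical per-tick scoring callable, and the smoothing factor, kill probability and low score are arbitrary placeholder values.

```python
import random

class TickAverages:
    """Exponential moving average of the running score seen at each tick."""

    def __init__(self, n_ticks, alpha=0.1):
        self.avg = [None] * n_ticks   # None until a tick has been observed
        self.alpha = alpha

    def update(self, tick, score):
        old = self.avg[tick]
        if old is None:
            self.avg[tick] = float(score)
        else:
            self.avg[tick] = (1 - self.alpha) * old + self.alpha * score

def evaluate(step, averages, n_ticks, kill_prob=0.05, low_score=0, rng=random):
    """Return (score, ticks_used); laggards may be killed early with low_score."""
    score = 0
    for tick in range(n_ticks):
        score += step(tick)
        past_avg = averages.avg[tick]      # average before this individual
        averages.update(tick, score)
        below = past_avg is not None and score < past_avg
        if below and rng.random() < kill_prob:
            return low_score, tick + 1     # abandon and move on
    return score, n_ticks
```

Because each individual is judged only against stored averages, evaluations do not need to be synchronized with one another. A ranking-based test over the last <n> scores could replace the below-average check without changing the structure.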

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Fri Feb 11 09:21:35 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA08819
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Fri, 11 Feb 1994 09:21:32 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id FAA21269 for <Genetic-Programming@list.stanford.edu>; Fri, 11 Feb 1994 05:59:28 -0800
Received: from io.salford.ac.uk by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA13678; Fri, 11 Feb 1994 05:53:36 -0800
Message-Id: <199402111353.AA13678@Sunburn.Stanford.EDU>
Received: from mailgate-0.salford.ac.uk by io.salford.ac.uk with SMTP (PP);
          Fri, 11 Feb 1994 13:53:24 +0000
From: A.Fraser@eee.salford.ac.uk
Date: 11 Feb 94 24:03
To: genetic-programming@cs.stanford.edu
X-Mailer: University of Salford cc:Mail/SMTP gateway 1.65
Encoding: 25 TEXT
Status: RO

     Dear GPer's,
     
     
        I am again working at developing my genetic programming in C++ 
     software ( ftp.cc.utexas.edu archive and at SAFIER ).  While I have 
     already implemented nearly all the functionality that I require to 
     complete my PhD I would like to produce a GP system with the same 
     broad ability of the genesis system for genetic algorithms.  I 
     therefore would like to ask the list what functions and procedures 
     people would like to see in a GP system.  The GPC++ code already has 
     real and floating point number capability, ADFs (just), tournament 
     and standard selection, so what other components are necessary?  
     Please, unless you think your ideas will inspire the whole list, could 
     you mail me direct, as the s/n ratio is excellent compared to other lists.
     
     Though this is also probably the wrong place for this, is anyone 
     looking for a ( fairly ) good programmer who would love to continue in 
     genetic programming but after PhD land finishes ( June 1995 ) has very 
     little chance of doing so?
     
     
     thanks in advance,
     
                Adam Fraser
     

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Feb 10 06:45:22 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA26717
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 10 Feb 1994 06:45:04 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id DAA18448 for <Genetic-Programming@list.stanford.edu>; Thu, 10 Feb 1994 03:38:31 -0800
Received: from cs.few.eur.nl (kaa.cs.few.eur.nl) by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA21565; Thu, 10 Feb 1994 03:37:13 -0800
Received: from platina.cs.few.eur.nl by cs.few.eur.nl (5.67/EUR)
	id AA26193; Thu, 10 Feb 94 12:37:04 +0100
From: Bernard Manderick <manderic@cs.few.eur.nl>
Received: by platina.cs.few.eur.nl (5.67/EUR/BSD)
	id AA05987; Thu, 10 Feb 94 12:37:02 +0100
Date: Thu, 10 Feb 94 12:37:02 +0100
Message-Id: <9402101137.AA05987@platina.cs.few.eur.nl>
To: genetic-programming@cs.stanford.edu
Subject: Call for participation
Cc: gusz@cs.vu.nl, zsofi@cs.vu.nl, manderic@cs.few.eur.nl
Status: RO

Dear moderator,

Could you post the call below in the next issue of your digest?

	Many thanks in advance,

	Bernard

_______________________________________________________________________________
	Bernard Manderick (Room H4-09) tel.  +31/10/408.18.53         
	Dept. of Computer Science      fax   +31/10/452.61.77 
	Faculty of Economics           email manderick@cs.few.eur.nl
	Erasmus University Rotterdam
	P.O. Box 1738
	3000 DR Rotterdam
	The Netherlands	
_______________________________________________________________________________
	
	
CALL FOR PARTICIPATION

ECAI-94 WORKSHOP ON 
APPLIED GENETIC AND OTHER EVOLUTIONARY ALGORITHMS
Amsterdam, August 9, 1994

One of the reasons for the growing interest in genetic and other 
evolutionary algorithms is their good performance on a wide range 
of problems. However, practical applications may raise issues 
beyond the scope of classical models. The goal of this workshop is 
to accumulate knowledge on the application of EAs and in particular 
to study changes to and extensions of the standard approaches. 
Topics of interest include but are not restricted to:

o problem elicitation and representation,
o non-standard genotypes and recombination operators,
o genetic programming,
o handling constraints,
o boosting performance,
o combination with other techniques, e.g. local search, neural nets,
  knowledge based systems,
o advantages and disadvantages of EAs w.r.t. other techniques.

About 10-12 of the participants will have the opportunity to 
introduce their work in the form of a short presentation. A limited
number of other persons interested in the subject may also
participate. Besides the presentations, substantial time will be allocated
for discussion and comparison of the presented results. Our hope is 
that insights gained at the workshop may facilitate further 
applications and support new theory. 

SUBMISSIONS
Two kinds of contributions are invited:

o papers that describe successful practical applications,
o papers investigating relevant issues by extensive test sessions on 
  a test bench.

In both cases the problem(s) to be solved and the issue(s) to be 
investigated should be clearly described, followed by the system 
description and experimental setup. Evaluation of the system and 
analysis of the test results should be given to support the 
conclusions.

Three camera-ready copies of a full paper not exceeding 12 pages 
(12 point font, single spaced) including figures and references 
should be sent to

              A.E. Eiben 
              Artificial Intelligence Group
              Dept. of Maths. and Comp. Sci.
              Free University Amsterdam
              De Boelelaan 1081a
              1081 HV Amsterdam
              The Netherlands
              email: ecai-ga@cs.vu.nl

Submission in PostScript format by e-mail is also possible. 
Accepted papers will be included in the workshop proceedings 
available at the workshop. In the meantime, the organizers aim 
at publication in a special journal issue or book containing the 
revised versions of the best papers.

Persons willing to attend the workshop without a presentation
should submit a brief description of their research area or field 
of interest. Deadlines for submission and notification are the same 
as for papers for presentation.


TIME TABLE
Deadline for submission: April 25, 1994
Notification of acceptance: May 30, 1994
Workshop: August 9, 1994


ORGANIZERS: 

A.E. Eiben 
Artificial Intelligence Group
Dept. of Maths. and Comp. Sci.
Free University Amsterdam
De Boelelaan 1081a
1081 HV Amsterdam
The Netherlands

Phone: +31-(0)20-5482997
Fax: +31-(0)20-6427705
email: gusz@cs.vu.nl


B. Manderick
Computer Science Department 
Erasmus University Rotterdam 
Burg. Oudlaan 50
3062 PA Rotterdam
The Netherlands

Phone: +31-(0)10-4081853
Fax: +31-(0)10-4526177
email: manderic@cs.few.eur.nl


Zs. Ruttkay 
Artificial Intelligence Group
Department of Mathematics and Computer Science 
Vrije Universiteit Amsterdam 
De Boelelaan 1081a 
1081 HV Amsterdam 
The Netherlands

Phone: +31-(0)20-5482412
Fax: +31-(0)20-6427705
email: zsofi@cs.vu.nl

------------------------------------------------------------------------------

From genetic-programming-owner@list.Stanford.EDU Thu Feb 10 06:42:48 1994
Received: from list.Stanford.EDU by ccwf.cc.utexas.edu with SMTP id AA26646
  (5.65c/IDA-1.4.4 for <McCoy@ccwf.cc.utexas.edu>); Thu, 10 Feb 1994 06:42:46 -0600
Received: from Sunburn.Stanford.EDU (Sunburn.Stanford.EDU [36.8.0.178]) by list.Stanford.EDU (8.6.4/8.6.4) with SMTP id DAA18441 for <Genetic-Programming@list.stanford.edu>; Thu, 10 Feb 1994 03:20:35 -0800
Received: from waldorf.Informatik.Uni-Dortmund.DE by Sunburn.Stanford.EDU with SMTP (5.67b/25-SUNBURN-eef) id AA20892; Thu, 10 Feb 1994 03:19:16 -0800
Received: from trurl.informatik.uni-dortmund.de
	by waldorf.informatik.uni-dortmund.de with SMTP (Sendmail 8.6.5/UniDo 2.0.13)
        id MAA25024; Thu, 10 Feb 1994 12:19:08 +0100
From: Robert Keller <keller@trurl.informatik.uni-dortmund.de>
Date: Thu, 10 Feb 94 12:19:06 +0100
Message-Id: <9402101119.AA01648@trurl.informatik.uni-dortmund.de>
Received: by trurl.informatik.uni-dortmund.de id AA01648; Thu, 10 Feb 94 12:19:06 +0100
To: genetic-programming@cs.stanford.edu
Subject: more problems?
Cc: keller@trurl.informatik.uni-dortmund.de
Status: RO

Hi,


Koza's book indeed covers a wide range of typical problems without
exact mathematical solution. Does anybody know of a problem collection
(as software, hardcopy, etc.) containing problems of the above-mentioned
type that the book does NOT deal with?

thanx

Robert

Robert E. Keller		Email: keller@LS11.Informatik.Uni-Dortmund.DE
				phone: +49-231-755-2107             ___
Dortmund University		                  -4591            ////
Computer Science Department	fax:   +49-231-755-2450       UNI DO//
Chair of Systems Analysis	                              \*\\///
44221 Dortmund, Germany	                                       \\\\/ 
_____________________________________________________________________________