Generational GP Algorithm
According to “A Field Guide to Genetic Programming”, there are three basic steps to generational, Tree-based GP:
- Generate an initial, stochastic population.
- Iteratively perform selection, genetic operation, and evaluation:
- Evaluate each program (hypothesis) in the current population against the given dataset and determine how well it performed, the value recorded as a fitness score.
- Randomly select programs and compare their fitness scores.
- Apply one of three genetic operators to a copy of the leading program:
- reproduction
- mutation
- crossover
… then move the program into the subsequent generation.
- Evaluate all programs (hypotheses) in each new generation.
- Repeat until the user-defined termination criteria are met.
In GP, each generation is composed of a population of individual programs. Each program is a mathematical function that when executed against the given data, produces a value.
The initial population (generation 0) is composed of randomly constructed programs built upon user-defined arithmetic, trigonometric, or boolean operators, and user-defined operands, variables which represent specific data parameters. The quantity of programs in the initial and all subsequent populations is defined by the user, and remains constant throughout the entire run.
The entire initial population is evaluated, meaning each individual program is executed against the given dataset, producing a single outcome per program. This outcome is measured against the known solution, and recorded as the fitness score. From this initial population, individuals programs are randomly selected for a tournament in which the program with the best fitness score is copied into the subsequent generation, following application of a genetic operator:
- With reproduction, the program is copied without modification.
- With mutation, a portion of the program is randomly modified.
- With crossover, two programs are selected to contribute a portion of their own code to a new program.
Each mutation and crossover might result in a reduction or improvement in the fitness score for each individual program. As with the initial population, all programs in each new generation are evaluated, randomly selected for a tournament, and then evolved for the subsequent generation. Because those programs selected exhibit the highest fitness score, their contribution to the next generation enables an overall improvement. This process continues until the user-defined termination criteria are met.
This page is from the MSc thesis of Kai Staats, “Genetic Programming Applied to RFI Mitigation in Radio Astronomy“