The MOSES algorithm consists of two nested optimization loops. The outer loop maintains a population of scored program trees, the so-called metapopulation. The inner loop explores a local neighborhood of a given program tree, using a representation centered on an exemplar. When the inner loop finds a reasonable set of candidate programs, these are returned to the outer loop, and merged back into the metapopulation. More precisely, the steps are as follows:
1. Selection step.
Choose one exemplar from the metapopulation. Initially, this will be the empty program, unless the user specified an initial exemplar. The choice is made by considering the entire metapopulation and picking the highest-scoring program tree that has not been previously explored. (There is little point in re-exploring the neighborhood of a previously explored program tree, as all improvements are likely to have already been found.) This selection is done by metapopulation::select_exemplar().
Actually, we've found that the algorithm works faster if we don't always use the highest-scoring program tree, but sometimes randomly pick a lower-scoring tree (using a Boltzmann distribution). Something similar is often seen in biology: mutations damage general-purpose mechanisms, making them less fit overall, but more fit in a narrower niche. That is, the path to greater fitness leads through damage.
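A Boltzmann-weighted selection along these lines can be sketched in Python as follows; the (score, tree, explored) triples and the temperature parameter are illustrative stand-ins, not the actual interface of metapopulation::select_exemplar():

```python
import math
import random

def select_exemplar(metapopulation, temperature=1.0, rng=random):
    """Boltzmann-weighted pick of an unexplored exemplar.

    `metapopulation` holds illustrative (score, tree, explored) triples;
    this is a sketch, not the actual C++ interface.
    """
    candidates = [(s, t) for (s, t, explored) in metapopulation if not explored]
    if not candidates:
        return None
    best = max(s for s, _ in candidates)
    # Subtract the best score before exponentiating, for numerical stability.
    weights = [math.exp((s - best) / temperature) for s, _ in candidates]
    r = rng.uniform(0, sum(weights))
    for (s, tree), w in zip(candidates, weights):
        r -= w
        if r <= 0:
            return tree
    return candidates[-1][1]   # guard against floating-point round-off
```

A high temperature makes the pick nearly uniform; a low temperature makes it nearly greedy, so lower-scoring ('damaged') trees still get an occasional turn.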
2. Representation-building step.
Given the exemplar, construct a representation; that is, take the exemplar and decorate it with knobs. Build a field set and a knob mapper that together act as a mapping between a linear bit-string and specific knob settings in the representation. The field set describes the layout of the bit-string; the knob mapper associates fields with knobs. Finally, create an initial instance: a bit-string that can be interpreted, via the field-set mapping, as a collection of knob settings.
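As a rough illustration of the field-set idea, the following Python sketch packs knobs with power-of-two setting counts into a plain bit-string; the class and method names are invented for this example, and the real MOSES field set is more elaborate (handling boolean, discrete, and continuous knobs):

```python
# Illustrative sketch of a field set: each knob occupies a fixed number of
# bits within a flat bit-string (the "instance").

class FieldSet:
    def __init__(self, knob_widths):
        # Record where each knob's field starts within the bit-string.
        self.widths = knob_widths
        self.offsets = []
        pos = 0
        for w in knob_widths:
            self.offsets.append(pos)
            pos += w
        self.length = pos

    def decode(self, instance):
        """Interpret a bit-string as a list of integer knob settings."""
        return [int(instance[off:off + w], 2)
                for off, w in zip(self.offsets, self.widths)]

# Three knobs with 4, 2 and 8 possible settings; the initial instance
# leaves every knob at setting 0.
fs = FieldSet([2, 1, 3])
initial_instance = "0" * fs.length
```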
3. Optimization step.
Given a representation, a field set, and an initial instance, invoke the inner optimization loop. Several different inner optimization loops are possible; they all have the steps below in common. Typically, a collection of scored instances is maintained; this collection is called a deme, and thus the first step is to open a new deme.
3a. Score the initial instance.
Evaluate the combo program that results from these specific knob settings, and see how well this program reflects the desired regression output.
3b. Generate new instances.
Generate new instances via some method (e.g. one-point mutations, cross-over, etc.) and score these instances in turn. Maintain a collection of scored instances; these are referred to as the deme. New instances are typically neighbors of other instances: they differ from existing instances by just a few knob settings (i.e. differ by only a few one-point mutations).
3c. Terminate the search.
Terminate the search via some exit criteria: lack of improvement, number of allowed evaluations exceeded, maximal neighborhood explored, etc.
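The inner loop (steps 3a-3c) can be sketched as a simple Python hill-climber over one-point mutations; the score callback and the evaluation limit here are placeholders for scoring the decoded combo program:

```python
# Minimal sketch of the inner loop as a hill-climber over one-point
# mutations of a bit-string instance.

def flip(instance, i):
    """One-point mutation: flip bit i of the instance."""
    return instance[:i] + ("1" if instance[i] == "0" else "0") + instance[i + 1:]

def inner_loop(initial, score, max_evals=1000):
    deme = [(score(initial), initial)]        # 3a: score the initial instance
    best_score, best = deme[0]
    evals = 1
    improved = True
    while improved and evals < max_evals:     # 3c: exit criteria
        improved = False
        for i in range(len(best)):            # 3b: generate neighboring instances
            cand = flip(best, i)
            s = score(cand)
            evals += 1
            deme.append((s, cand))
            if s > best_score:                # greedily climb to the new best
                best_score, best = s, cand
                improved = True
    return sorted(deme, reverse=True)[:10]    # best instances, for the merge step
```

The search stops on lack of improvement or when the evaluation budget is exhausted, and hands its best scored instances to the deme-closing step.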
4. Close the deme.
This step accepts the list of the best instances found in the previous step and merges them back into the metapopulation. First, convert each instance in the deme back into an ordinary program tree (i.e. by fixing each knob at its setting, thus 'removing' the knobs). Next, normalize each of these, that is, reduce it to elegant normal form. Finally, merge the resulting programs into the metapopulation, ranking them by score.
Currently, merging is done merely by ranking all program trees by score. A more complex merge, by domination, is also possible (dominated programs are discarded; non-dominated ones are added to the metapopulation). However, the domination algorithm has three major drawbacks: (1) it's complex, (2) it's slow, and, most importantly, (3) it eliminates some of the very best exemplars; specifically, as mentioned in the selection step above, sometimes (usually?) the fittest specimens are more easily evolved from 'damaged' exemplars. The unfit, dominated exemplars often contain a structure that can be fine-tuned to become extremely fit, whereas the fit exemplars don't carry such excess baggage.
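The two merge policies can be contrasted in a Python sketch operating on (score-vector, tree) pairs; the per-test-case score vectors and the cap parameter are assumptions for illustration only:

```python
def merge_by_rank(metapop, candidates, cap=100):
    """Current policy: pool everything and keep the top-ranked programs.

    Programs are illustrative (score_vector, tree) pairs; the rank is
    taken here as the sum of the per-test-case scores.
    """
    pooled = metapop + candidates
    pooled.sort(key=lambda p: sum(p[0]), reverse=True)
    return pooled[:cap]

def dominates(a, b):
    """a dominates b if a scores >= b on every case and > on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def merge_by_domination(metapop, candidates):
    """Alternative policy: discard every dominated program. Note how this
    throws away the 'damaged' but potentially evolvable exemplars."""
    pooled = metapop + candidates
    return [p for p in pooled
            if not any(dominates(q[0], p[0]) for q in pooled if q is not p)]
```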
5. Go to step 1.
Repeat until a termination criterion is met, such as achieving a perfect score or exceeding the maximal number of evaluations.
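Putting steps 1 through 5 together, here is a toy end-to-end Python sketch in which the 'programs' are simply bit-strings scored by their number of ones; every helper here is a stand-in for the machinery described above:

```python
def toy_moses(length=8, max_demes=10):
    """Toy outer loop: steps 1-5 on bit-string 'programs'."""
    perfect = length
    metapop = [("0" * length, 0)]             # (program, score); start 'empty'
    explored = set()
    for _ in range(max_demes):
        # 1. Selection: best program not yet explored.
        unexplored = [p for p in metapop if p[0] not in explored]
        if not unexplored:
            break
        exemplar, score = max(unexplored, key=lambda p: p[1])
        if score == perfect:                  # 5. termination criterion
            break
        explored.add(exemplar)
        # 2+3. Representation and inner loop: the one-point-mutation neighborhood.
        deme = []
        for i in range(length):
            cand = exemplar[:i] + ("1" if exemplar[i] == "0" else "0") + exemplar[i + 1:]
            deme.append((cand, cand.count("1")))
        # 4. Close the deme: merge back, ranked by score, keeping the top 50.
        metapop = sorted(set(metapop + deme), key=lambda p: p[1], reverse=True)[:50]
    return max(metapop, key=lambda p: p[1])
```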
Comparison to Cartesian Genetic Programming
In certain ways, the MOSES algorithm is similar to Cartesian GP (CGP), employing similar mechanisms. The use of instances in MOSES is identical to the core idea of CGP: an instance is just a string of numbers, and this space of all strings of numbers can be explored in various ways (by applying hill-climbing, simulated annealing, or Bayesian algorithms to one-point mutations or genetic cross-overs of strings). That is, the MOSES knob settings are identical to the notion of function-settings in CGP.
MOSES differs from CGP by not having a fixed number of layers or columns: instead, the complexity and depth of the program tree grows over time. Unlike CGP, MOSES does not have 'connection' genes. However, it does have a very similar concept; it's just implemented differently. The notion of connection genes is instead replaced by a periodic 'knob decoration' step. It behaves as if the 'connection genes' were explored by MOSES at a slower rate than the function genes. The MOSES steps of selecting an exemplar, decorating it with knobs (and, at the end, removing unused knobs) are closely analogous to exploring different connection-gene settings.
Note that representation-building, the optimization algorithm, and normalization are vital steps of the algorithm, and they crucially influence its performance. Representation-building is specific to each domain (e.g., learning propositional formulae), while the optimization algorithm is general (it operates only on instances). MOSES currently supports representation-building for several problem domains, including propositional formulae, actions, arithmetic formulae, and predicate logic (arithmetic relations embedded in propositional formulae). MOSES also supports several different optimization algorithms, including hill-climbing, simulated annealing, and Bayesian optimization. Work on support-vector-machine (SVM) optimization is underway. Only one form of program reduction, to elegant normal form, is supported. Other types of reduction, e.g. SAT-based or satisfiability-modulo-theories (SMT), may be possible but remain unexplored.
One interesting idea worth pursuing is to exchange the inner and outer loops. This would allow MOSES to operate in a 'continuous transfer-learning mode'. Currently, the only way to perform transfer learning within MOSES is to supply the system with an initial set of exemplars (as opposed to starting with the empty set) and then let it mutate those. If instead the loop order were exchanged (or, more precisely, the loops united), training would not need to be performed on a closed table of training values, but could instead stream sensor data continuously from the external environment, maintaining a continuously evolving set of program trees.