Basic vocabulary used in MOSES.
An Anki deck of the definitions can be found here:
A program, in MOSES, is a combo program. A combo program is represented as a tree of operators, variables and values. Nodes in the tree may be constants (bits, integers, real numbers, etc.), boolean operators (and, or, etc.), arithmetical operators (+, -, *, etc.), functions (sin, cos, etc.) or logical expressions (if...then...else, etc.), and so on. Arguments to an n-ary function are denoted with '$', so that would be the arguments. Thus, for example, is a program that takes argument (a float pt. number), multiplies it by 0.5, and checks to see if it is greater than zero. The result of this compare is or-ed ( -ed) with argument (a boolean). Although programs may be explicit, as in this example, a program can also be understood to be a representation, together with a particular set of knob settings, as explained below.
An exemplar is a specific program; typically, the fittest one found.
A representation is a parameterized tree structure, representing a particular region of program space, centered around a single program (the exemplar). A representation is derived from the exemplar by inserting additional nodes in various (random) locations. The inserted nodes, however, are not specific values or functions or operators, but are rather place-holders for values/functions to be determined later. Each place-holder may be thought of as a parameter, and is colloquially referred to as a knob. A representation, together with a particular setting of the knobs, is equivalent to a program. During the optimization step in MOSES (see scoring function), the space of possible parameter or knob settings will be heuristically explored, to attempt to locate the best possible settings, i.e. the fittest program.
A representation built (around an exemplar) is given in terms of knobs. A knob is a single dimension of variation relative to an exemplar program tree. It may be discrete or continuous. For example, given the program tree fragment , a continuous knob might be used to vary the numerical constant. So, setting this knob to 0.7 would transform the tree fragment to . A discrete knob with arity()==3 might be used to transform the boolean input . So, setting this knob to 1 might transform the tree to , and setting it to 2 might remove it from the tree (while setting it to 0 would return to the original tree).
The step in the MOSES algorithm where an exemplar is chosen, and a representation is constructed from it.
TODO: Answer: Q) What is actually involved in the representation construction? (answered under representation is the question Q) What is a representation?)
An instance is an array of particular knob settings. For compactness, instances are maintained as strings of bits; the description of which bit-fields correspond to which knob settings are kept in separate structures, the field set and the knob mapper. During optimization (see scoring function), an evolutionary algorithm will pick and choose among many different instances; a single field set and knob mapping suffices to describe them all.
The bits in the instance bit-string are organized into an array of fields. Each field corresponds to a single knob. The field set describes each field in the array: whether it is discrete or continuous, how many settings it may have, etc. The field set is divorced from the representation or any combo program: it is merely a listing of possible knobs, but does not indicate where those knobs are located in the representation.
The nearest neighbors of an instance are those other instances that differ by exactly one knob setting. This is called the neighborhood at distance one. With distance understood as the Hamming distance, one can then consider progressively larger neighborhoods: those that differ by just two knob settings, or three, etc.
A deme is a population of programs derived from one single representation (constructed from an exemplar). Thus, a deme can be thought of as a population of knob settings. During the optimization phase, an optimizer algorithm, such as hill-climbing, simulated annealing, or the Bayesian optimization algorithm is used to work with the population, locating the best possible knob settings for the given representation. In practice, in the actual implementation, a deme is just a set of scored instances. This is because all instances in a deme share the same representation, field set and knob mapping.
MOSES maintains a collection of demes, playing each off the others. This set of demes is referred to as the metapopulation. Pairs of demes are in competition; fitter demes are used to replace less fit demes.
Is there a particular phase in which pairs of demes compete? What is the terminology? and description?
During the optimization phase, candidate programs being explored are scored by a scoring function. The function is specific to the given problem; it returns a value indicating how closely the candidate program matched the desired output. For supervised training (aka regression) problems, the scoring function just returns how closely the candidate program matched the training set. For demonstration problems, the scoring function is typically some well-studied toy problem, such as parity, onemax, santa-fe-trail, etc. Usually, the perfect score is 0 , while worse scores are negative. Fitter programs have higher scores.
One program instance is considered to dominate another if it is better in every way. The concept of domination requires a scoring function that issues not just one grand-total score, but an array of scores. For example, for regression problems, a program instance may be judged on how accurately it provides an output given an input. To test this, one typically provides a table of N input rows, with each row indicating a desired output. The program can then be tested on each row, with the result compared to the desired output value for that row. One program is said to dominate another only if it has a better score on each of the N tests. Typically, two different programs do not dominate one-another: one is better for some input rows, while the other is better at others. In MOSES, both are kept around and further evolved, with the goal of eventually finding a program that dominates all. Programs that are completely dominated are (usually) discarded.
The normalization step of the MOSES algorithm takes a program, and simplifies it, using re-writing rules. The resulting program is said to be in normal form. Thus, for example, can be reduced to just since or-ing with false changes nothing. Similarly, can be normalized to since multiplying by one-half never changes the sign of a number. Likewise, the expression “” can be reduced to , since a value is always equal to itself, and so the if-branch is always taken. Normalization can sometimes eliminate large parts of a program, if they are vacuous or tautological. There are many different types of normalization that are possible; MOSES always normalizes to the so-called elegant normal form. The word reduction is often used as a synonym for normalization.