Meta-Optimizing Semantic Evolutionary Search
Meta-optimizing semantic evolutionary search (MOSES) is a new approach to program evolution, based on representation-building and probabilistic modeling. MOSES has been successfully applied to solve hard problems in domains such as computational biology, sentiment evaluation, and agent control. Results tend to be more accurate, and require less objective function evaluations, than other program evolution systems, such as genetic programming or evolutionary programming . Best of all, the result of running MOSES is not a large nested structure or numerical vector, but a compact and comprehensible program written in a simple Lisp-like mini-language.
A discussion of how MOSES fits into the grand scheme of OpenCog is given on the OpenCogPrime:Probabilistic Evolutionary Learning Overview page.
MOSES performs supervised learning, and thus requires either a scoring function or training data to be specified as input. As output, it generates a Combo program that, when executed, approximates the scoring function. MOSES uses general concepts from evolutionary search, in that it maintains a population of programs, and then explores the neighborhood of modified, "mutated" programs, evaluating their fitness. After some number of iterations, the fittest program found is output.
More precisely, MOSES maintains a population of demes. Each deme is a program with many adjustable, tuneable parameters. These adjustable parameters are colloquially referred to as knobs. Thus, finding the fittest program requires not only selecting a deme, but also determining the best setting for the knobs.
The MOSES algorithm proceeds by selecting a deme and performing random mutations on it, by inserting new knobs in various random places. The best-possible knob settings for the mutated deme are found by using using existing, well-known optimization algorithms such as hill-climbing, simulated annealing or estimation of distribution algorithms (EDA) such as Bayesian optimization (BOA/hBOA). The fitness of the resulting program(s) can be compared to the fittest exemplar of other demes. If the new program is found to be more fit, it is used to start a new deme, (while discarding the old, dominated deme) and the process then repeats.
All program evolution algorithms tend to produce bloated, convoluted, redundant programs ("spaghetti code"). To avoid this, MOSES performs reduction at each stage, to bring the program into normal form. The specific normalization used is based on Holman's "elegant normal form", which mixes alternate layers of linear and non-linear operators. The resulting form is far more compact than, say, for example, boolean disjunctive normal form. Normalization eliminates redundant terms, and tends to make the resulting code both more human-readable, and faster to execute.
The above two techniques, optimization and normalization, allow MOSES to out-perform standard genetic programming systems. The EDA algorithms, by finding the dependencies in a Bayesian network, in fact are able to find how different parts of a program are related. This quickly rules out pointless mutations that change one part of a program without making corresponding changes in other, related parts of the program. The other important ingredient, reduction to normal form, allows programs to become smaller, more compact, faster to execute, and more human readable. Besides avoiding spaghetti code, normalization removes redundancies in programs, thus allowing smaller populations of less complex programs, speeding convergence.
The programs that MOSES generates are "generic", in the sense that MOSES works with structured trees, represented in Combo. Such trees can represent propositional formula, procedural or functional programs, etc. The core MOSES solver is written in C++, and takes the form of a library. There are many example programs illustrating how to use this library.
OpenCog includes a program for performing feature selection or dimensional reduction. In machine learning problems (and MOSES s no exception) there are often a lot of variables that simply will not contribute to the final result. Eliminating these at the outset will improve the speed and accuracy of the machine learning system. The feature-selection program is tailored to MOSES for this use.
- MOSES terminology, vocabulary needed to understand the documentation.
- MOSES algorithm, a brief overview of the main algorithm.
- MOSES man page for the command line tool.
- File:MOSES-QuickGuide.pdf provides an overview, including vocabulary, a summary of the algorithm, and a survey of the implementation internals, the various C++ classes and methods of the implementation. (TeX sources are in the opencog source tree, located in doc/moses/QuickGuide.tex)
- Moshe Looks, (2006), "Competent Program Evolution", PhD thesis.
- Moshe Looks, Ben Goertzel, (2008), "A Representation of Programs for Learning and Reasoning"
- The OpenCog Brainwave blog entry Tuning MOSES offers some insight into how the algorithm is actually behaving.
- Modelling Atom Spaces, or Continuous and Hierarchical Optimization with Estimation of Distributions and Occam Bias
- MOSES has both a distributed, and a multi-threaded mode.
- A tutorial with emphasis on Distributed MOSES.
A WIN32 binary compiled via Cygwin is available here.
A rewrite in LISP called PLOP is in development.