Meta-Optimizing Semantic Evolutionary Search

From OpenCog

Meta-optimizing semantic evolutionary search (MOSES) is an approach to program evolution based on representation-building and probabilistic modeling. MOSES has been successfully applied to solve hard problems in domains such as computational biology, sentiment evaluation, and agent control. Results tend to be more accurate, and to require fewer objective-function evaluations, than those of other program-evolution systems such as genetic programming or evolutionary programming. Best of all, the result of running MOSES is not a large nested structure or numerical vector, but a compact and comprehensible program written in a simple Lisp-like mini-language.

A discussion of how MOSES fits into the grand scheme of OpenCog is given on the OpenCogPrime:Probabilistic Evolutionary Learning Overview page.

Example

The easiest way to explain MOSES is by means of a simple example. Suppose one has a truth table of binary values:

Scoring Table
col1 col2 rslt
F F F
F T F
T F F
T T T

Then, given this table as input, the MOSES system is able to induce a short boolean expression for this table, namely, that the column rslt is given by

(col1 logical-AND col2) 

That is, MOSES was able to "learn" the "program" (col1 logical-AND col2) from this input.
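To make the idea concrete, here is a rough Python sketch (not MOSES itself, and far simpler than its actual search) of what "inducing a program from a truth table" means: enumerate a few candidate expressions and keep the one that reproduces the rslt column.

  # Illustrative Python sketch, not MOSES itself: brute-force search for a
  # one-operator boolean expression that reproduces the rslt column.
  table = [((False, False), False),
           ((False, True),  False),
           ((True,  False), False),
           ((True,  True),  True)]

  # A tiny stand-in for the space of candidate programs.
  candidates = {
      "logical-AND": lambda a, b: a and b,
      "logical-OR":  lambda a, b: a or b,
      "logical-XOR": lambda a, b: a != b,
  }

  for name, op in candidates.items():
      if all(op(col1, col2) == rslt for (col1, col2), rslt in table):
          print("(col1 " + name + " col2) fits the table")   # prints the AND case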

This example is over-simplified: MOSES has been used with tables up to 300 columns wide and hundreds of thousands of rows long. The data does not have to be presented as a table; it can also be given as a function. Column values need not be booleans; they can also be floating-point numbers or symbolic values (strings). Operations include not only the boolean operators AND, OR and NOT, but also the arithmetic operators + - * / (plus, minus, times, divide) and relational operators such as less-than and equals. Experimental prototype versions of MOSES have explored differential equations and other more complex structures.

In general, it is not possible to fit a large, complex table with a small, short program fragment. Thus, MOSES allows a "complexity penalty" to be specified, so that the resulting programs stay as small as possible while remaining reasonably accurate. More than one output program is generated; these are presented in order of accuracy (for example, precision and recall). Assorted pre-processing tools can eliminate columns that are mostly duplicates of one another, thus reducing the search space.
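The sketch below, in illustrative Python rather than actual MOSES code, shows the general idea of a complexity-penalized score; the penalty weight and the way program size is measured are assumptions made only for this example, as the real scorer is configurable.

  # Illustrative Python sketch of complexity-penalized scoring; the penalty
  # weight and the size measure are assumptions made for this example only.
  def penalized_score(predict, size, table, complexity_ratio=0.05):
      """Accuracy on the table, minus a penalty proportional to program size."""
      correct = sum(predict(row) == target for row, target in table)
      return correct / len(table) - complexity_ratio * size

  table = [((False, False), False), ((False, True), False),
           ((True, False), False), ((True, True), True)]
  and_program = lambda row: row[0] and row[1]      # size 3: two leaves, one operator
  print(penalized_score(and_program, 3, table))    # 1.0 accuracy minus 0.05 * 3 penalty, about 0.85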

The original MOSES is a stand-alone tool suite. The ASMOSES project is a port of MOSES to the AtomSpace, so that it can both work on datasets contained in the AtomSpace, and also store results in the AtomSpace.

Overview

MOSES performs supervised learning, and thus requires either a scoring function or training data to be specified as input. As output, it generates a Combo program that, when executed, approximates the scoring function. MOSES uses general concepts from evolutionary search, in that it maintains a population of programs, and then explores the neighborhood of modified, "mutated" programs, evaluating their fitness. After some number of iterations, the fittest program found is output.

More precisely, MOSES maintains a collection of demes, where each deme is a population of programs derived from a single exemplar program. The exemplar is decorated with many adjustable, tunable parameters, colloquially referred to as knobs; each setting of the knobs yields a different program in the deme. Thus, finding the fittest program requires not only selecting a deme, but also determining the best setting for its knobs.

The MOSES algorithm proceeds by selecting a deme and performing random mutations on it, inserting new knobs at various random places. The best possible knob settings for the mutated deme are then found using existing, well-known optimization algorithms such as hill-climbing, simulated annealing, or estimation-of-distribution algorithms (EDAs) such as the Bayesian optimization algorithm (BOA/hBOA). The fitness of the resulting program(s) can be compared to the fittest exemplars of other demes. If the new program is found to be fitter, it is used to start a new deme (the old, dominated deme is discarded), and the process then repeats.
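The following much-simplified Python sketch shows the flavour of the inner optimization step, with knob settings treated as a bit vector tuned by steepest-ascent hill-climbing; real MOSES knobs are richer (discrete and continuous), and annealing or hBOA may be used in place of hill-climbing.

  import random

  # Much-simplified sketch of the inner optimization step: knob settings as a
  # bit vector tuned by steepest-ascent hill-climbing.  Real MOSES knobs are
  # richer, and annealing or hBOA can be used instead of hill-climbing.
  def hill_climb(score, n_knobs, max_iters=100):
      settings = [random.choice([0, 1]) for _ in range(n_knobs)]
      best = score(settings)
      for _ in range(max_iters):
          # The neighbourhood: every setting that is one knob-flip away.
          neighbours = [settings[:i] + [settings[i] ^ 1] + settings[i + 1:]
                        for i in range(n_knobs)]
          top = max(neighbours, key=score)
          if score(top) <= best:        # local optimum reached
              break
          settings, best = top, score(top)
      return settings, best

  # Toy fitness function: prefer settings with more knobs switched on.
  print(hill_climb(lambda s: sum(s), n_knobs=8))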

All program evolution algorithms tend to produce bloated, convoluted, redundant programs ("spaghetti code"). To avoid this, MOSES performs reduction at each stage, to bring the program into normal form. The specific normalization used is based on Holman's "elegant normal form", which alternates layers of linear and non-linear operators. The resulting form is far more compact than, for example, boolean disjunctive normal form. Normalization eliminates redundant terms, and tends to make the resulting code both more human-readable and faster to execute.
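Holman's elegant normal form is too involved for a short example, but the Python sketch below conveys the flavour of reduction: rewrite rules that shrink a boolean expression tree without changing its meaning. The two rules shown are simple illustrations only, not the actual Reduct rule set.

  # Flavour of reduction (not Holman's elegant normal form, nor the actual
  # Reduct rules): rewrite rules that shrink a boolean expression tree without
  # changing its meaning.  Expressions are nested tuples such as
  # ("and", "col1", ("not", ("not", "col2"))).
  def reduce_expr(expr):
      if not isinstance(expr, tuple):
          return expr                                   # a bare variable
      op, *args = expr
      args = [reduce_expr(a) for a in args]
      if op == "not" and isinstance(args[0], tuple) and args[0][0] == "not":
          return args[0][1]                             # not(not(x)) -> x
      if op in ("and", "or"):
          uniq = []
          for a in args:                                # drop duplicate operands
              if a not in uniq:
                  uniq.append(a)
          return uniq[0] if len(uniq) == 1 else (op, *uniq)
      return (op, *args)

  print(reduce_expr(("and", "col1", ("not", ("not", "col1")))))   # -> col1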

The above two techniques, optimization and normalization, allow MOSES to outperform standard genetic programming systems. By learning the dependencies in a Bayesian network, the EDA algorithms discover how different parts of a program are related. This quickly rules out pointless mutations that change one part of a program without making corresponding changes in other, related parts. The other important ingredient, reduction to normal form, makes programs smaller, more compact, faster to execute, and more human-readable. Besides avoiding spaghetti code, normalization removes redundancies, thus allowing smaller populations of less complex programs and speeding convergence.

The programs that MOSES generates are "generic", in the sense that MOSES works with structured trees, represented in Combo. Such trees can represent propositional formulas, procedural or functional programs, and so on. The core MOSES solver is written in C++ and takes the form of a library. There are many example programs illustrating how to use this library.

OpenCog includes a program for performing feature selection, or dimensional reduction. In machine learning problems (and MOSES is no exception) there are often many variables that simply do not contribute to the final result. Eliminating these at the outset improves both the speed and the accuracy of the learning system. The feature-selection program is tailored to MOSES for this use.
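As an illustration of the general idea (not of the actual feature-selection program, which has its own algorithms and options), the Python sketch below scores each binary column by its mutual information with the target and keeps only the most informative ones.

  from collections import Counter
  from math import log2

  # Illustration of one common filter (not the actual feature-selection
  # program): score each binary column by its mutual information with the
  # target and keep the k most informative columns.
  def mutual_information(xs, ys):
      n = len(xs)
      pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
      return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
                 for (x, y), c in pxy.items())

  def select_features(columns, target, k):
      ranked = sorted(columns, reverse=True,
                      key=lambda name: mutual_information(columns[name], target))
      return ranked[:k]

  columns = {"col1":  [0, 0, 1, 1, 0, 0, 1, 1],
             "col2":  [0, 1, 0, 1, 0, 1, 0, 1],
             "noise": [0, 1, 1, 0, 1, 0, 0, 1]}
  target = [0, 0, 0, 1, 0, 0, 0, 1]
  print(select_features(columns, target, k=2))   # ['col1', 'col2']; noise is dropped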

Documentation

  • File:MOSES-QuickGuide.pdf provides an overview, including vocabulary, a summary of the algorithm, and a survey of the implementation internals (the various C++ classes and methods). (TeX sources are in the opencog source tree, located in doc/moses/QuickGuide.tex)
  • MOSES for Noobs provides an easy-to-follow tutorial for first time users of MOSES.
  • Nil Geisweiller's lectures on MOSES and Reduct during the AGI summer school in Xiamen.
  • The OpenCog Brainwave blog entry Tuning MOSES offers some insight into how the algorithm is actually behaving.
  • A tutorial with emphasis on Distributed MOSES.

Code

PLOP

A rewrite of MOSES in Common Lisp, called PLOP, also exists.

See also