This page provides a list of ideas for changing or enhancing MOSES in various ways, to make it more powerful, more useful or possibly faster.
Most of the content from the GSOC Ideas for MOSES page should be moved here. Almost all of the proposals there need to be worked out in much greater detail, as they are currently hard to understand and evaluate. Most of the proposals below should probably be explained in greater detail as well.
At this time, MOSES is highly tuned for the supervised learning of combo expression trees that model rectangular tables of data. The combo programs describe the dependent column, as a function of the independent columns in the table.
The proposal is to add support for 'virtual columns' in the table. These virtual columns would be described by combo trees: i.e. would be functions of the other columns in the table. Once in place, they would then be treated just like any other columns in the input table.
These columns could be specified 'by hand' by the user, or they could be auto-generated and managed by MOSES internals.
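As a rough illustration of the idea, the sketch below models a table as named columns and lets a virtual column be defined as a function of the other columns. The `Table` class, its `add_virtual_column` method and the column names are all hypothetical, and plain Python functions stand in for combo trees; none of this is existing MOSES code.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


class Table:
    """A rectangular table stored column-wise (hypothetical, not MOSES code)."""

    def __init__(self, columns: Dict[str, List[float]]):
        self.columns = dict(columns)
        self.n_rows = len(next(iter(columns.values())))

    def add_virtual_column(self, name: str, expr: Callable[[Row], float]) -> None:
        # Evaluate the expression once per row and store the result, so that
        # later reads cost the same as reading an ordinary input column.
        rows = [{k: v[i] for k, v in self.columns.items()}
                for i in range(self.n_rows)]
        self.columns[name] = [expr(row) for row in rows]


table = Table({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})
table.add_virtual_column("x_times_y", lambda r: r["x"] * r["y"])
# A virtual column can be used like any other column, including by
# another virtual column.
table.add_virtual_column("total", lambda r: r["x_times_y"] + r["x"])
```

Storing the evaluated values, rather than only the defining expression, is what would make a virtual column interchangeable with an input column.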
There are many benefits that such a structure could provide:
- Memoization of common subexpressions. If a given virtual column stores the pre-computed values associated with its combo tree, it can be understood as a memoization of that tree. This could improve performance.
- Differential equation learning. For example, if an input table contained position data obtained at uniform time intervals, then the difference between two neighboring rows would be a velocity; the difference of two velocities would be an acceleration. Having these as column values would enable MOSES to learn simple differential equations. Note, however: currently, in combo, there is no way to create a combo expression that says "take the difference between neighboring rows". If such an ability is added, and it is also added as a combo vertex function, then MOSES could just use that during knob decoration, and virtual columns would not actually be needed for diff eq learning (although they may speed it up, due to memoization, above).
- Simplified Neural Net support (or SVM, or other linear kernel methods). A single virtual column could hold the output of a single 'neuron'. This would allow MOSES to learn expressions that combine the outputs of different neurons. This raises the question: who sets the weights for the neurons? With an appropriate API, other, outside systems could set the weights. But possibly MOSES could set the weights itself: in principle, MOSES is capable of learning linearized expressions; in practice, it is currently very slow to learn these. One of the main benefits here is, again, memoization.
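The "difference between neighboring rows" operation mentioned in the differential equation item can be sketched as follows. The function name `row_difference` and the sample data are illustrative assumptions; no such operator exists in combo today.

```python
from typing import List


def row_difference(column: List[float]) -> List[float]:
    """Difference between each row and the one before it.

    The result is one element shorter than the input column.
    """
    return [later - earlier for earlier, later in zip(column, column[1:])]


# Illustrative positions sampled at a uniform time step of 1 (x = t**2).
position = [0.0, 1.0, 4.0, 9.0, 16.0]
velocity = row_difference(position)
acceleration = row_difference(velocity)
```

With a unit time step, the first differences approximate the velocity and the second differences the acceleration; for this data the acceleration column is constant.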
Thus, virtual columns seem like a good idea, but there may be other ways of obtaining similar functionality.
Hypergraph instead of collection of trees
An alternative way to address memoization, and to save some RAM, is to represent the population as a hypergraph, akin to a knowledge base in an atomspace, with the difference that the mutable part of an atom would essentially be a column of values instead of a truth value.
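A much-simplified sketch of the sharing this would give: if structurally identical subexpressions map to a single node, and each node carries its column of values, then a subtree common to several candidates is evaluated only once. The `ExprStore` class and the nested-tuple encoding of expressions are assumptions made for illustration; this is neither the atomspace API nor MOSES code.

```python
import operator
from typing import Dict, List, Tuple, Union

Expr = Union[str, Tuple]  # a column name, or a tuple (op, left, right)

OPS = {"+": operator.add, "*": operator.mul}


class ExprStore:
    """Keeps one column of values per distinct subexpression (hypothetical)."""

    def __init__(self, inputs: Dict[str, List[float]]):
        self.inputs = inputs
        self.values: Dict[Tuple, List[float]] = {}
        self.evaluations = 0  # number of subexpressions actually computed

    def evaluate(self, expr: Expr) -> List[float]:
        if isinstance(expr, str):
            return self.inputs[expr]
        if expr not in self.values:
            op, left, right = expr
            a, b = self.evaluate(left), self.evaluate(right)
            self.values[expr] = [OPS[op](x, y) for x, y in zip(a, b)]
            self.evaluations += 1
        return self.values[expr]


store = ExprStore({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})
shared = ("*", "x", "y")
candidate_1 = ("+", shared, "x")
candidate_2 = ("*", shared, "y")
result_1 = store.evaluate(candidate_1)
result_2 = store.evaluate(candidate_2)
```

Here the two candidates contain four operator nodes in total, but only three are computed, because the shared product is stored once and reused.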
Reduct engine based on algebraic properties
A reduct engine driven by declared algebraic properties could simplify the code and make it easier to add more operators or functions as building blocks (XOR, for example). Additionally, if those properties could be inferred automatically via some narrow theorem-proving strategy, it would eliminate the need to specify the properties for new functions.
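To make the idea concrete, here is a minimal sketch in which the simplifier knows nothing about individual operators and consults only a table of declared properties. The `OpProps` fields, the `PROPS` table and `reduce_expr` are hypothetical names, and the rules cover just a few boolean identities; the actual reduct engine is far more extensive.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

Expr = Union[str, bool, Tuple]  # variable, constant, or (op, left, right)


@dataclass(frozen=True)
class OpProps:
    commutative: bool = False
    idempotent: bool = False         # op(x, x) == x
    self_inverse: bool = False       # op(x, x) == identity
    identity: Optional[bool] = None  # op(x, identity) == x (two-sided)


# Adding a new operator such as XOR only requires one new entry here.
PROPS = {
    "and": OpProps(commutative=True, idempotent=True, identity=True),
    "or": OpProps(commutative=True, idempotent=True, identity=False),
    "xor": OpProps(commutative=True, self_inverse=True, identity=False),
}


def reduce_expr(expr: Expr) -> Expr:
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = reduce_expr(left), reduce_expr(right)
    props = PROPS[op]
    if props.commutative:
        # Canonical argument order, so that op(a, b) and op(b, a) compare equal.
        left, right = sorted((left, right), key=repr)
    if left == right:
        if props.idempotent:
            return left
        if props.self_inverse and props.identity is not None:
            return props.identity
    for this, other in ((left, right), (right, left)):
        if isinstance(this, bool) and this == props.identity:
            return other
    return (op, left, right)


reduced = reduce_expr(("xor", ("and", "a", "b"), ("and", "b", "a")))
```

In this example both arguments of the XOR reduce to the same canonical form, so the whole expression reduces to the constant `False`.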