Proposed Code Changes for Feed-forward ANN MOSES
Create a simple evaluation task
Look at the ant evaluator as an example.
The evaluation task could be something simple like XOR to begin with, then once we are using RNNs we could switch to double-pole-balancing, a standard neuroevolution benchmark task.
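As a rough illustration of what the XOR task could score, here is a minimal Python sketch (not MOSES code). It assumes a candidate is any callable taking two inputs and returning one output, and that fitness is the negated sum of squared errors over the four XOR cases; the name `xor_fitness` and the scoring rule are assumptions made for this example.

```python
from itertools import product

def xor_fitness(candidate):
    """Negated sum of squared errors over the four XOR cases (0 is a perfect score)."""
    error = 0.0
    for a, b in product((0.0, 1.0), repeat=2):
        target = float(a != b)            # XOR truth table
        error += (candidate(a, b) - target) ** 2
    return -error

# Two hypothetical candidates: an exact XOR and a constant guess.
perfect = lambda a, b: float(a != b)
constant = lambda a, b: 0.5
```

A higher score is better here, so `xor_fitness(perfect)` beats `xor_fitness(constant)`; the real evaluator would wrap an ANN built from the combo tree instead of a lambda.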
We may want to disable reduct initially.
Expansion will have to respect the structure of the feed-forward network: the first k parameters of an ANN-Node must be links to other nodes or input nodes, and the second k parameters must be the corresponding weight values. (It looks like this will have to be handled in representation-building.) Furthermore, all trees must terminate in ANN-Input nodes as leaves.
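The structural constraints above can be checked mechanically. The following Python sketch assumes a stand-in encoding chosen only for this example: an ANN-Node is a tuple `("ANN-Node", link_1..link_k, weight_1..weight_k)` and an input leaf is the string `"ANN-Input"`. The function name `is_well_formed` is hypothetical.

```python
def is_well_formed(node):
    """Check that every ANN-Node has k links followed by k numeric weights,
    and that every branch ends in an ANN-Input leaf."""
    if node == "ANN-Input":
        return True
    if not (isinstance(node, tuple) and node and node[0] == "ANN-Node"):
        return False
    args = node[1:]
    if len(args) == 0 or len(args) % 2 != 0:
        return False                      # must be 2*k arguments
    k = len(args) // 2
    links, weights = args[:k], args[k:]
    for w in weights:
        if isinstance(w, bool) or not isinstance(w, (int, float)):
            return False                  # second half must be weight values
    return all(is_well_formed(child) for child in links)

good = ("ANN-Node", ("ANN-Node", "ANN-Input", "ANN-Input", 0.5, 0.3), 0.7)
odd_arity = ("ANN-Node", "ANN-Input", 0.5, 0.3)
swapped = ("ANN-Node", 0.5, "ANN-Input")
```

Here `good` passes, while `odd_arity` and `swapped` are rejected. A check like this could guard the output of representation-building while the expansion code is still being written.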
This is currently the most confusing part of MOSES to me, although I'm making strides in understanding it. Basically we need a good way of knobifying and expanding the ANN representation.
I've been looking at the continuous knob building and the logical knob building, which take different approaches. I suppose our approach would be closer to the continuous one, since our representation is weight-centric.
Expansion could work as follows: given an existing ANN tree, we create new hidden nodes between existing connections. A new hidden node can have connections to any of the downstream neurons, and new continuous knobs would be added for all of those connections.
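The simplest case of this expansion, splicing one new hidden node into a single existing connection, might look like the Python sketch below. It assumes the same stand-in tuple encoding `("ANN-Node", links..., weights...)` with `"ANN-Input"` leaves; `split_connection` and the default initial weight of 1.0 are assumptions for illustration, and the new weight is the value a new continuous knob would control.

```python
def split_connection(node, i, new_weight=1.0):
    """Return a copy of `node` with a new hidden node spliced into its i-th link.

    The old source of link i becomes the single input of the new hidden node,
    and `new_weight` is the weight of that new connection.
    """
    args = node[1:]
    k = len(args) // 2
    links, weights = list(args[:k]), list(args[k:])
    links[i] = ("ANN-Node", links[i], new_weight)   # new hidden node
    return ("ANN-Node", *links, *weights)

hidden = ("ANN-Node", "ANN-Input", "ANN-Input", 0.5, 0.3)
expanded = split_connection(hidden, 1)
```

This only covers a new node with one incoming connection; letting the new node connect to several downstream neurons would add one weight (and one knob) per connection. Note that with a nonlinear activation the spliced network does not compute the same function as before, whatever initial weight is chosen.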
Sidenote: There appears to be debug code that could be removed to potentially improve MOSES' performance. In build_knobs.cc, in disc_probe (lines 255-271 and lines 276-280), there are two tree copies and two reductions executed on each probe that are not used for anything, as far as I can tell.
One technical problem is that program trees do not overlap, while the naive graph codifying approach would have nodes that share children. I am not sure how big of a problem this is yet. My alternative approach is to use arguments #1...#n instead of having actual shared links to ANN-Nodes.
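To make the argument-based alternative concrete, here is a small Python sketch. It assumes a stand-in tuple encoding `("ANN-Node", links..., weights...)` in which input leaves are argument names such as `"#1"` and `"#2"`; the helper `input_refs` is hypothetical. Two hidden nodes can then read the same input without any tree node being shared.

```python
def input_refs(node):
    """List the argument leaves (e.g. "#1") of a tree in left-to-right order."""
    if isinstance(node, str):
        return [node]
    k = (len(node) - 1) // 2              # first k arguments are links
    refs = []
    for child in node[1:1 + k]:
        refs.extend(input_refs(child))
    return refs

# Both hidden nodes read both inputs, yet the structure is still a plain tree.
h1 = ("ANN-Node", "#1", "#2", 1.0, 1.0)
h2 = ("ANN-Node", "#1", "#2", -1.0, -1.0)
out = ("ANN-Node", h1, h2, 1.0, 1.0)
refs = input_refs(out)
```

This handles shared inputs only. A hidden node feeding two downstream nodes would still have to be duplicated as two subtrees, so the sharing problem is reduced rather than removed.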
Todo: Add new operators to builtins, and extend str_to_vertex: opencog/comboreduct/combo/vertex.h
Top-level (output) operator: ANN (k-ary)
Input-node operator: ANN-Input (0-ary)
Output/Hidden-node operator: ANN-Node (2*k-ary). The first k children link to other ANN-Node and ANN-Input nodes (downstream only, no cycles allowed). The second k children are continuous constants giving the weights of those connections (from the first k children to this node).
Todo: Create a new evaluator that creates an ANN from the graph codified representation. Model after eval_throws in: opencog/comboreduct/eval/h
Create a sample ANN combo tree for testing:
(ANN (ANN-Node (ANN-Node (ANN-Input ANN-Input 0.5 0.3) 0.7)))
Which corresponds to this network:
Output <--(0.7)-- Hidden Node <--(0.5)-- Input
                              <--(0.3)-- Input
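A minimal sketch of what an evaluator for the sample tree could do, written in Python rather than against the real combo types. It makes several assumptions that the notes above do not settle: the tree is encoded as nested tuples, ANN-Input leaves are bound to input values in left-to-right order, input values pass through unchanged, and every ANN-Node applies a sigmoid to its weighted sum. The names `eval_node` and `eval_ann` are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def eval_node(node, inputs):
    """Evaluate one subtree; `inputs` is an iterator consumed one leaf at a time."""
    if node == "ANN-Input":
        return next(inputs)
    args = node[1:]
    k = len(args) // 2
    links, weights = args[:k], args[k:]
    # Children are evaluated left to right, so leaves consume inputs in order.
    total = sum(w * eval_node(child, inputs) for child, w in zip(links, weights))
    return sigmoid(total)

def eval_ann(tree, input_values):
    """Evaluate a top-level ("ANN", out_1, ..., out_k) tree; returns k outputs."""
    inputs = iter(input_values)
    return [eval_node(out, inputs) for out in tree[1:]]

# The sample tree: output <-0.7- hidden <-0.5- input, <-0.3- input
sample = ("ANN", ("ANN-Node", ("ANN-Node", "ANN-Input", "ANN-Input", 0.5, 0.3), 0.7))
outputs = eval_ann(sample, [1.0, 0.0])
```

For the sample tree this computes sigmoid(0.7 * sigmoid(0.5 * x1 + 0.3 * x2)). Binding inputs by leaf order is the weakest assumption here; it breaks as soon as two nodes need the same input.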
Initially we will disable reduct (just to make it easier to test the codified structure).
After that, I think a symbolic derivative approach (or a numerical approximation as a temporary step) to determine redundant weights could be effective. Ben mentioned using threshold activation functions in place of sigmoidal ones, because threshold units might be more amenable to reduction. At the moment, I am not optimistic about using the existing reduction engine on the codified ANN structures.
I've done some preliminary testing with random RNNs with one input, one output, and many hidden nodes. It shows that many connections have little or no impact on the final output (similar to Moshe's experiments with random boolean formulas in his thesis). This suggests that a symbolic derivative reduction might work.
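The numerical approximation mentioned above could be sketched as follows in Python. This is a toy feed-forward example, not the RNN experiment described here: the network `net`, the sample inputs, the step size, and the 1e-6 cutoff are all assumptions chosen for illustration. It estimates, by central differences, how much the output moves when each weight is perturbed, and flags weights with (near) zero effect.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def net(w, x):
    # Two hidden nodes feeding one output; w[3] == 0 cuts off the second hidden node.
    return sigmoid(w[2] * sigmoid(w[0] * x) + w[3] * sigmoid(w[1] * x))

def weight_sensitivity(f, w, xs, eps=1e-4):
    """Mean absolute central-difference derivative of f with respect to each weight."""
    result = []
    for i in range(len(w)):
        hi = list(w)
        lo = list(w)
        hi[i] += eps
        lo[i] -= eps
        diff = sum(abs(f(hi, x) - f(lo, x)) for x in xs)
        result.append(diff / (2 * eps * len(xs)))
    return result

weights = [0.5, 0.3, 0.7, 0.0]
samples = [-1.0, 0.0, 1.0]
sens = weight_sensitivity(net, weights, samples)
redundant = [i for i, s in enumerate(sens) if s < 1e-6]
```

In this toy case only `w[1]` is flagged, since its hidden node is multiplied by a zero weight downstream. A symbolic derivative would give the same answer without sampling, but the numerical version is easier to try first.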