Integrative Procedure and Predicate Learning
Integrative Procedure Learning
This chapter discusses two closely related aspects of OCP: schema learning and predicate learning, grouped under the general category of "procedure learning." The material is integrative — we explain how various ideas presented in previous chapters may be used to provide a powerful integrative-intelligence approach to the learning of complex declarative and procedural knowledge. Much of this material is "advanced AGI" which does not yet exist in the current version of OpenCog, but will be critical for the realization of OpenCog's ultimate ambitions.
Schema learning — the learning of SchemaNodes and schema maps — is OCP lingo for learning how to do things. Learning how to act, how to perceive, and how to think — beyond what's explicitly encoded in the system's CIM-Dynamics. Ultimately, when OCP becomes more profoundly self-modifying, schema learning will be the main process driving its evolution.
Predicate learning, on the other hand, is the most abstract and general manifestation of pattern recognition in the OCP system. PredicateNodes, along with predicate maps, are OCP's way of representing general patterns (general within the constraints imposed by the system parameters, which in turn are governed by hardware constraints). Predicate evolution, predicate mining and higher-order inference — specialized and powerful forms of predicate learning — are the system's most powerful ways of creating general patterns in the world and in the mind. Simpler forms of predicate learning are grist for the mill of these processes.
All aspects of OCP are difficult, but some are more difficult than others — and procedure learning definitely falls into the latter category. Specifically, of all the AI tasks with which we have experimented, schema learning has proved the trickiest. As the chapter unfolds, we will try to convey the nature of this difficulty, along with the solutions we suggest and why we believe they will be successful.
It may be useful to draw an analogy with another (closely related) very hard problem in OCP, probabilistic logical unification (which in the OCP framework basically comes down to finding the SatisfyingSets of given predicates). Hard logical unification problems can often be avoided by breaking down large predicates into small ones in strategic ways, guided by non-inferential mind processes, and then doing unification only on the smaller predicates. Our limited experimental experience indicates that the same "hierarchical breakdown" strategy also works for schema and predicate learning, to an extent. But still, as with unification, even when one does break down a large schema or predicate learning problem into a set of smaller problems, one is still in most cases left with a set of fairly hard problems.
More concretely, OCP procedure learning may be generally decomposed into three aspects:
- Converting back and forth between maps and compound ProcedureNodes (encapsulation and expansion)
- Learning the Combo Trees to be embedded in grounded ProcedureNodes
- Learning procedure maps (networks of grounded ProcedureNodes acting in a coordinated way to carry out procedures)
These difficult problems are dealt with using a combination of techniques:
- Probabilistic inference (PLN)
- Evolutionary programming (PEL)
- Reinforcement learning (i.e. inference on HebbianLinks)
- Procedure map encapsulation and expansion
- Pattern-formation-driven predicate evolution
We are relying on a combination of techniques to do what none of the techniques can accomplish on their own. The combination is far from arbitrary, however. As we will see, each of the techniques involved plays a unique and important role.
Each of the three aspects of OCP procedure learning mentioned above may be dealt with somewhat separately, though relying on largely overlapping methods.
First, the conversion aspects will be dealt with later on, in the more general context of map encapsulation and expansion. However, a few comments on procedure encapsulation and expansion may be useful here.
Converting from grounded ProcedureNodes into maps is a relatively simple learning problem. The challenge is choosing among the many possible ways to split a Combo tree into subtrees. Once this is decided, it's easy enough to create separate ProcedureNodes corresponding to the separate subtrees. Determining when to expand compound ProcedureNodes into maps can also be subtle; the process is guided by Goal Nodes and will be discussed later on.
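To make the subtree-splitting step concrete, here is a minimal sketch in Python. The `ComboNode` class and the size-based splitting heuristic are illustrative stand-ins only, not the actual OpenCog Combo tree data structures; a real implementation would weigh subtree reusability and coherence, not just size.

```python
# Hypothetical sketch: enumerating subtrees of a Combo tree as
# candidates for separate ProcedureNodes.  ComboNode and the
# min_size heuristic are illustrative assumptions.

from dataclasses import dataclass, field
from typing import List

@dataclass
class ComboNode:
    op: str
    children: List["ComboNode"] = field(default_factory=list)

    def size(self) -> int:
        # Total number of nodes in this subtree.
        return 1 + sum(c.size() for c in self.children)

def split_subtrees(root: ComboNode, min_size: int = 2) -> List[ComboNode]:
    """Collect subtrees with at least min_size nodes as candidates
    for encapsulation into separate ProcedureNodes."""
    found: List[ComboNode] = []

    def walk(node: ComboNode) -> None:
        if node.size() >= min_size:
            found.append(node)
        for c in node.children:
            walk(c)

    walk(root)
    return found
```

For example, for the tree `and(not(x), y)` this heuristic would nominate the whole tree and the `not(x)` subtree as candidates, while leaving the leaf nodes alone.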
Converting from maps into ProcedureNodes is significantly trickier. First, it involves carrying out datamining over the network of ProcedureNodes, identifying subnetworks that are coherent schema or predicate maps. Then it involves translating the control structure of the map into explicit logical form, so that the encapsulated version will follow the same order of execution as the map version. This is an important case of the general process of map encapsulation, to be discussed in the final chapter.
Next, the learning of grounded ProcedureNodes is carried out by a synergistic combination of two mechanisms: evolutionary programming, and logical inference. These two approaches have quite different characteristics. Evolutionary programming excels at confronting a problem that the system has no clue about, and arriving at a reasonably good solution in the form of a schema or predicate. Inference excels at deploying the system's existing knowledge to form useful schemata or predicates. The choice of the appropriate mechanism for a given problem instance depends largely on how much relevant knowledge is available.
A relatively simple case of compound ProcedureNode learning is where one is given a ConceptNode and wants to find a ProcedureNode matching it. For instance, given a ConceptNode C, one may wish to find the simplest possible predicate whose corresponding PredicateNode P satisfies
SatisfyingSet(P) = C
On the other hand, given a ConceptNode C involved in inferred ExecutionLinks of the form
ExecutionLink C A_i B_i   (i = 1, ..., n)
one may wish to find a schema whose corresponding SchemaNode will fulfill this same set of ExecutionLinks. These kinds of compound ProcedureNode learning may be cast as optimization problems and carried out by evolutionary programming. Once learned via evolutionary programming or other techniques, the resulting procedures may be refined via inference.
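The first of these optimization problems can be sketched as a fitness function for the evolutionary search: reward agreement between a candidate predicate and membership in C, and penalize complexity so that the simplest matching predicate wins. Everything here — the membership samples, the candidate interface, and the complexity penalty — is an illustrative assumption, not the actual OCP fitness function.

```python
# Hypothetical fitness for evolving a predicate P with
# SatisfyingSet(P) = C.  The candidate interface (evaluate, size)
# and the complexity weight are assumptions for illustration.

def fitness(candidate, members, non_members, complexity_weight=0.01):
    """Reward agreement with C's membership; penalize predicate size."""
    correct = sum(1 for x in members if candidate.evaluate(x))
    correct += sum(1 for x in non_members if not candidate.evaluate(x))
    accuracy = correct / (len(members) + len(non_members))
    return accuracy - complexity_weight * candidate.size()

class ThresholdPredicate:
    """Toy candidate predicate: true when the input exceeds a threshold."""
    def __init__(self, threshold):
        self.threshold = threshold
    def evaluate(self, x):
        return x > self.threshold
    def size(self):
        return 1
```

An evolutionary loop would then mutate and recombine candidates, keeping those with the highest fitness; inference would later refine the winners.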
The other case of compound ProcedureNode learning is goal-driven learning. Here one seeks a SchemaNode whose execution will cause a given goal (represented by a Goal Node) to be satisfied. The details of Goal Nodes have already been reviewed; all we need here is that a Goal Node presents an objective function to be maximized, and thereby poses the problem of finding schemata whose enaction will maximize that function.
The learning of procedure maps, on the other hand, is carried out by reinforcement learning. This is a matter of the system learning HebbianLinks between ProcedureNodes, as described in the section Learning Schema Maps.
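The HebbianLink learning underlying procedure maps can be illustrated with a minimal update rule: strengthen a link when the two ProcedureNodes it joins are active together, and decay it otherwise. This particular update formula and learning rate are assumptions for illustration, not the actual OCP Hebbian dynamics (which operate on probabilistic truth values).

```python
# A minimal Hebbian-style update for a link between two
# ProcedureNodes in a procedure map.  The moving-average rule and
# the learning rate are illustrative assumptions.

def hebbian_update(strength, src_active, tgt_active, rate=0.1):
    """Move link strength toward 1 when source and target fire
    together, and decay it toward 0 otherwise."""
    target = 1.0 if (src_active and tgt_active) else 0.0
    return strength + rate * (target - strength)
```

Repeated over many execution episodes, links between nodes that reliably act in sequence are reinforced, and the coordinated subnetwork emerges as a procedure map.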
Note that compound ProcedureNodes learned via evolution and refined via inference may be translated into procedure maps; or, procedure maps learned via reinforcement learning may be translated into compound ProcedureNodes.
Comparison with Other Approaches To Procedure Learning
It is perhaps worth briefly drawing some contrasts with known alternative approaches. Instead of a complete literature review, we merely provide a selective discussion of some alternatives that have seemed interesting to the authors for one reason or another.
One alternative, well known to the original author of this page and several colleagues (including members of the Novamente team) if not to many others, is the Webmind 2000 schema module. This design will not be described here in detail, but some brief indications may be useful. In this approach, complex learned schemata were represented as distributed networks of elementary SchemaInstanceNodes, but these networks were not defined purely by function application: they involved explicit passing of variable values through VariableNodes. Special logic-gate-bearing objects were created to handle the distinction between the arguments of a SchemaInstanceNode and the predecessor tokens giving a SchemaInstanceNode permission to act.

The basic problem, apart from implementational inefficiency, was severe brittleness: small modifications to an effective schema map would too frequently yield a completely dysfunctional schema. The root cause was the overcomplexity of the design. Everyone who worked with this system, theoretically or pragmatically, clearly realized that it was infeasibly complex in terms of the human effort required to get all the parts to work together.

There seems little doubt that OCP's Combo-tree-based approach is preferable. The key difference is that in OCP, procedural knowledge is stored and manipulated separately from declarative knowledge, whereas in Webmind 2000, procedural knowledge was essentially treated as a special case of declarative and associational, Hebbian-type knowledge; this is mathematically and conceptually correct, but unacceptably inefficient. It turns out that imitating the human brain's decision to separate declarative and procedural knowledge, as an abstract design choice, is a solid idea in terms of achieving design tractability and computational feasibility.
Next, let us consider the procedure-learning methods preferred by the traditional AI community. (Note: this review is extremely brief and inadequate, and should probably be spun out into a separate page at some point. On the other hand, the reviews of AI planning on Wikipedia and elsewhere online are not written from an AGI perspective; someone should write a constructively critical review of planning software and algorithms from that perspective.)
There are numerous planning methods out there, based on logical inference, graph analysis, simplified probabilistic inference and other techniques, but overall the performance of these systems is not very impressive. These programs work effectively only under very particular constraints on the problems they are posed, and almost none of them deal adequately with uncertainty. In OCP terms, these approaches are basically specialized inference control processes. One of our favorites among the existing AI planning approaches is Probabilistic GraphPlan (Blum and Langford, 1999), but even it is too oversimplified to deal with most real-world planning problems.
There are neural-net-based procedure learning methods, most notably variations on recurrent backpropagation. But recurrent backprop is an inefficient algorithm, not really scalable to large learning problems (though attempts have been made to work around this issue via massively parallel implementations; see, e.g., Suresh et al., 2005). The problems with these techniques were partially resolved by more recent work on Long Short-Term Memory networks (Gers and Schmidhuber, 2001), but even this work does not fully do the trick; like more traditional work with recurrent neural nets under gradient-descent learning, it shows a tendency to either fall into local optima or take forever to converge on a solution. Hochreiter and Mozer (2001) tried to improve on LSTM networks by abandoning the neural-net aspect altogether and proposing a recurrent-net-like approach based on probability theory; this improved learning in many cases but showed an even worse tendency to fall into local optima, suggesting that a more general probability-based approach is necessary, like the one used in OCP.
Other reinforcement-learning-style methods like Hebbian learning have also failed to scale to very complex problems like the one at hand. Gerald Edelman's (1987) Neural Darwinism approach, which is more a brain theory than a crisp implementable algorithmic formulation, is conceptually somewhat similar to the approach we propose here: he has Hebb-style learning leading to the concretization of neuronal maps which are basically neuronal subnetworks that act as encapsulated schema, while still physically overlapping.
And then there is the genetic programming approach, which also has scaling problems, though these can be dealt with more effectively than in the recurrent backprop case. As indicated earlier, we feel that our approach to evolutionary programming resolves GP scaling problems to a significant extent. But we also feel that evolutionary programming is suited only to be part of the solution to procedure learning, not the whole solution.
Finally, there are classifier systems, which will be discussed below (the reinforcement learning aspect of OCP procedure learning bears a significant resemblance to classifier system learning). They share with recurrent backprop some very basic scalability issues, which is why, though these systems have been around for some time, there are still no industrial-strength, large-scale applications. Eric Baum's Hayek system (Baum, 2004) solves some of the problems associated with classifier systems, but its learning algorithm is extremely slow and highly erratic, requiring substantial customization for each new problem it is applied to.
All in all, we believe that the proposed OCP schema learning approach is conceptually much more advanced than the alternatives surveyed above, primarily due to its combination of evolutionary and inferential techniques, and the sophistication with which procedure learning is integrated with other cognitive processes.
We are optimistic about the performance of this critical subset of OCP dynamics, though we are also confident that substantial tuning — and substantial processing power — will be required to obtain optimal performance.
Summary Remarks on Procedure Learning
The OCP approach to procedure learning, put as simply as possible, involves:
- Evolutionary procedure learning for dealing with brand new procedure learning problems, requiring the origination of innovative, highly approximate solutions out of the blue
- Inferential procedure learning for taking approximate solutions and making them exact, and for dealing with procedure learning problems within domains where closely analogous procedure learning problems have previously been solved
- Heuristic, probabilistic data mining for the creation of encapsulated procedures (which then feed into inferential and evolutionary procedure learning)
- PredictiveImplicationLink formation (augmented by PLN inference on such links) as an OCP version of goal-directed reinforcement learning
Using these different learning methods together, as a coherently-tuned whole, one arrives at a holistic procedure learning approach that combines speculation, systematic inference, encapsulation and credit assignment in a single adaptive dynamic process.
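The PredictiveImplicationLink component of this process can be sketched very simply: estimate the strength of "executing schema S predicts goal G" from the frequency with which G was observed to follow executions of S. The episode representation and frequency estimate below are illustrative assumptions; the actual OCP mechanism uses PLN truth values with confidence, not bare frequencies.

```python
# Hedged sketch of estimating a PredictiveImplicationLink strength
# from experience.  The (executed_schemas, satisfied_goals) episode
# format is an assumption made for illustration.

def predictive_implication_strength(episodes, schema, goal):
    """Fraction of episodes in which executing `schema` was
    followed by `goal` being satisfied."""
    executed = [ep for ep in episodes if schema in ep[0]]
    if not executed:
        return 0.0  # no evidence yet
    followed = sum(1 for ep in executed if goal in ep[1])
    return followed / len(executed)
```

Goal-directed behavior then amounts to preferring schemata whose predictive implication toward the currently active goal is strongest, with PLN inference filling in strengths for schemata that have never been tried directly.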
It is also worth noting that this hybrid algorithm is highly amenable to distributed processing, even globally distributed processing. Evolutionary programming, probabilistic data mining, and PredictiveImplicationLink formation are extremely well-distributable. Inference is a little less so since it relies on background knowledge at every step; but one can easily envision a procedure learning system that did globally distributed GP and reinforcement learning across millions of widely dispersed machines, and then turned the results of this over to a less widely distributed system carrying out inference.