Unified rule engine
From OpenCog
The idea of moving to a single unified rule engine for various OpenCog rule-based processes keeps coming up: e.g. RelEx, RelEx2Frame, PLN,...
Contents |
Existing "General" Rule Engines
From Ben's, Linas', Moshe's emails...
It seems that the first step in this regard should be to study existing open-source rule engines a bit....
I really doubt any of them out there could serve our purposes ... but perhaps looking at these generalized rule engines should help us in designing our own.... (And of course there's always the remote chance one of them will prove adequate...)
Most open-source rule engines are based on the Rete algorithm.
There is also Rete II, which I don't understand. Rete has many drawbacks e.g. it performs poorly when matching rules against dynamic data.
The best-known open-source rule engine is DROOLS. which is in Java. I used something called RC++ years ago but I suspect it hasn't been maintained.
DROOLS currently only does forward chaining, but they say a backward chainer will be supported in the next release.
Its big and not easily grokked. A quick skim through the docs hints that it implements the equivalent of what we call the AtomSpace; they call it "WorkingMemory". It seems to assume crisp logic.
The "facts" that are inserted are Java objects. This is what Drools calls OO RETE. For instance, if one inserts in WorkingMemory a Java object with 2 properties like this:
Person p = new Person();
p.setFirstName("Jimi");
p.setLastName("Hendrix");
wm.insert( p );
this asserts the 100% truth of properties assignments FirstName and setLastName on p .
User:Jvanel , a Drools expert added this :) .
It also separates the RuleBase (where the
rules are kept) from the WorkingMemory, whereas
we want the two to be mashed into one. Whether
Drools would allow the two to intermingle is unclear.
If I was starting from scratch, and was a big Java fan, then Drools would be the way to go. ... but err, my gut instinct is to stay away.
OTOH, Jess has a more annoying license making it not suitable for OSS use, but has a backward chainer...
The classic is CLIPS which has this improved closed-source version as well.
Does Alchemy have anything that resembles what we would call a rule engine? Their web page makes no hint of this, and yet there's that slippery word "inference" ...
Basic Requirements
Anyway ... if none of these or others like them proves suitable (which is what I suspect), it seems one path we might take is to develop our own generic rule engine as a separate module, i.e. a rule engine which uses rules stored in the AtomSpace format but other than that operates generically without specialization to any particular OpenCog AI process.
We'd then need to decide what chaining algorithms to use ... presumably we could make the forward and backward chaining algorithms pluggable, so that different MindAgents could potentially invoke the OpenCogRuleEngine and specify different chaining algorithms (e.g. some chaining algorithms might provide better real-time response whereas others might be smarter if you have more time available...)
Linas: That would be my impression. The rule-engine would need to be peppered with lots of little call-outs, so as to compare atoms appropriately at various times.
That is, the rule engine *algorithm* would need to be completely detached from the data representation that it operates on. I get the feeling that some rule engines don't detach in this way. Other requirements:
- Doesn't assume crisp logic, i.e. provides callbacks where we can plug in calculations to meld confidence, truth, importance, etc.
- Doesn't have a compile step: i.e. doesn't assume that the rules come from some file, and are compiled into code.
Advanced requirements
1) I took a declarative approach (probably easier in Lisp, but doable in C++) that seems to have been successful in decreasing the incremental complexity of adding a new rule to the system. Instead of manually telling the system when to apply rules and in what order, a rule declaration has an associated type, preconditions, and postconditions (i.e. "rule X assumes that rule Y has already been applied (precondition) and if rule Z has already been applied, then it need not be reapplied after rule X has been applied (postcondition)"). The central controller synthesizes all of this information into a DAG, and does topological sort to obtain the actual order that rules are applied in for some type.
2) You really want to have some mechanism for keeping track of when a subtree is simplified according to some rule, to avoid having to reapply a expensive rules to trees that are in fact simplified. There are two basic ways to do this, a cache (i.e.. hash table) mapping trees to their simplifications, and annotations on subtrees indicating if they are simplified, and according to which rules. Unfortunately both have their drawbacks (Richard Fateman has some nice papers kvetching about this, IIRC), the main one in both cases being that "simplified" is in fact context-dependent (e.g. X AND Y is simplified on its own, but not if it is embedded into a clause such as X OR (X AND Y)). I chose the latter, and suffice it to say that a fair bit of finesse is required to get things fully bug-free...
3) There is lots of know-how from the folks that built computer algebra systems (stretching back 40+ years) that is worth reading up on...
Representing Rules in the AtomSpace
The distinction between what PLN calls "rules" and most Rule engines call "rules" is slightly different. Or rather, PLN has a separate layer of base rules that alter and synthesise axioms in the system.
One way to represent the PLN rules in the AtomSpace, while also allowing them to use specific formulas for the consequent, is to use the following representation (Deduction rule example):
ImplicationLink
AND
InheritanceLink $A $B
InheritanceLink $B $C
EquivalenceLink
ExOutLink GetTruthValue (InheritanceLink $A $C)
ExOutLink
GroundedSchemaNode: DeductionFormula
ListLink
ExOutLink GetTruthValue $A
ExOutLink GetTruthValue $B
ExOutLink GetTruthValue $C
ExOutLink GetTruthValue (InheritanceLink $A $B)
ExOutLink GetTruthValue (InheritanceLink $B $C)
Specific Design Requirements
There are some aspects related to the architecture of the new Unified Rule Engine which must be considered, given the acquired experience at developing and making use of a Rule Engine for Embodiment:
- Action execution callback: it would be great if a listener system was develop to notify a list of listener objects, that a specific action was selected and executed in a given cycle.
- Avoiding repeating the same action: maybe it would be a good solution to provide a mechanism that permits consulting a history of executed actions and using that information as preconditions, rules strengths, etc. Maybe creating a minimal Common Sense reasoning engine to do that work can be a good solution.
Current plans
Linas is currently (sept 2009) making plans to implement some of the ideas above. The core ideas/steps are as follows:
- The current query/pattern matcher can evaluate implication links, and has a sufficiently rich set of callbacks to provide the flexibility described above. Unfortunately, it is slow; as it is essentially a "naive" rule engine, and provides no caching, etc. Linas believes he knows how to add Rete-algo-type caching structures to vastly improve performance (possibly in exchange for significant mem usage). However:
- To implement the Rete-algo data structures, a new set of indexes are required for the atomtable. Currently, the atom table provides 5 "hard-wired" indexes, and no pluggable index infrastructure. So a preliminary step is to add pluggable indexes (and convert the existing indexes to that infrastructure).
Simplified ImplicationLinks
Currently, writing ImplicationLinks by hand is quite tedious, as the syntax is not very compact. Thus, a separate, distinct project would be to create a new, more compact language that can be quickly/trivially translated into ImplicationLinks. Linas believes that a very good choice would be patterned on the syntax of Prolog. This is in part because current RelEx output is very prolog-ish already, and because most of the casual email discussions already tend to use a casual prolog-like syntax. So the idea is to formalize this a bit -- its already the way that people are thinking about things.
Note: the Schelog project allows embedding of prolog into Scheme as a first-class guest. Unfortunately, this can't really work within OpenCog, but something similar would be nice.