Unified rule engine

From OpenCog
(Redirected from Unified Rule Engine)
Jump to: navigation, search

The unified rule engine project aims at building a generic opencog rule engine on top of the Pattern Matcher. The current design is drawn from conclusion arose from the URE historical discussion.

Chainer design for a unified OpenCog rule engine

This document gives a sketch of a design for a general-purpose OpenCog rule chainer. It is not very expository, and is intended as part of a dialogue among a group of folks who understand what’s going on with the new PLN chainer initiative. It will be expanded into a more expository document as time goes on and design/implementation continues.

The document is written largely with PLN chaining in mind, but the chainer described is intended to work much more generally. It begins with backward chaining, and then turns to forward and forward-backward chaining at the end.

For the URE usage and configuration format of the actual implementation see URE Configuration Format.

Overall Backward Chaining Process

We may assume the backward chaining process is initialized with

  • A target Atom
  • A fitness function (see below)
  • A rule-base (given as a set of Atoms)
  • Optionally, a focus-set of Atoms to search

High-level pseudocode for the backward chaining process described here would be as follows.

Initialize the potential-target list with the initial target Atom

Repeat

  1. Choose the next target, from the potential-target list (via stochastic selection based on weights that are assigned to potential targets based on multiple factors)
  2. Choose a rule to apply to the target (selecting from among the admissible rules in the rule-base via stochastic selection, based on the weights of the rules in the current context). (Note that each rule also indicates a fitness function to be used for its premises, based on the fitness function for its conclusion.)
  3. Choose premises for the rule via searching the “focus set” specified, using the PatternMatcher (with an appropriate callback to guide the PM’s search based on the fitness functions for the premises)
  4. Check if one of the stopping criteria has been met. If so, exit the loop and stop repeating.
  5. Push the selected premises onto the potential-target list
    1. Alternatively, when no premise is available, add the rule's input to the potential-target list
  6. Return to the start of the loop

Types of Targets and Fitness Functions for Backward Chaining

The backward chainer (BC) can be triggered with several possible target types, each coming with possible fitness functions. The set of target types and fitness functions should be considered as flexible rather than fixed for all time. Target types that are useful for PLN include:

1) Find the truth value of a given Atom A, e.g.
InheritanceLink cat animal <s,c>
In this case the goal is to find an estimate of the truth value of A with confidence as high as possible.So we will say the :“fitness function” is confidence.
2) Given an Atom A whose targets include variables, find assignments of values to the variables for which the truth value of A is :known as confidently as possible. E.g. if the target were
InheritanceLink $X animal
then the goal would be to find $X for which it’s confidently known whether or not $X is an animal.
3) Given an Atom A whose targets include variables, find assignments of values to the variables that make the truth value of A as “large” as possible. For instance, we might have
A = InheritanceLink $X animal <s,c>
In this case, it is not obvious how to quantify the goal of the “largeness” of the truth value of A. A flexible approach: would be to look at a fitness criterion constituting . This lets one balance strength versus confidence.
4) In some cases, it may be useful to pursue a variant of 3) in which one tries to find assignments of values to variables: that make the truth value of A as “small” as possible, quantified e.g. by the fitness criterion Even if the original target of BC is of type 3), the BC process may create a target of type 4) as a subgoal. An example would be an XOR relation. If one has C = XOR(A,B) (where A, B and C all involve variable $X), then in order to maximize s*c of C, one could either:
  • maximize s*c of A and minimize s*c of B, or
  • minimize s*c of A and maximize s*c of B

A Single Step

A single backward-chaining step contains the following substeps. Beginning with a target T, it:

  • Chooses a rule that is capable of yielding the target T as output
  • Chooses inputs (premises) for the rule, which do in fact yield the target T as output

Either of the above choices may be made with the fitness criterion associated with the target in mind.

Rule Selection

How to choose a rule capable of yielding T as output? Firstly, the selection is made among the rules in the “rule base” R that has been associated with the current instance of the backward chainer, at the time of its launch. Information regarding the format of the output of each of these rules must be stored in the AtomSpace. In this case, T can be compared to the output formats of the rules in the rule-base R, and in this way a set of admissible rules can be selected.

Next, each admissible rule is assumed to have a certain weight, which indicates the probability of its selection. For an initial implementation, each rule in the rule-base may be assigned a certain constant weight (e.g. deduction may have a higher weight than a quantifier rearrangement rule, etc.). For a more advanced implementation, rule weights must be determined based on the specific context (the target, and the previous context of the inference process, and maybe other information). There should be some sort of getWeight(Rule R) function, which initially can be filled simply, and later can be made more complex.

To store information regarding rule output format in the Atomspace, a new link type such as ExecutionRecordLink could be used, e.g.

ExecutionRecordLink
   GroundedSchemaNode "scm:PLNDeductionRule"
   SetLink
      Implication $A $B
      Implication $B $C
   SetLink
      Implication $A $C
   SetLink
      Equivalence
         ExecutionOutputLink
            GroundedSchemaNode: TruthValueOf
               Implication $A $C
         ExecutionOutputLink
            GroundedSchemaNode "scm:PLN_Deduction_Formula"
               ListLink
                  ExecutionOutputLink
                     GroundedSchemaNode: TruthValueOf
                     Implication $A $B
                  ExecutionOutputLink
                     GroundedSchemaNode: TruthValueOf
                     Implication $B $C
                  ExecutionOutputLink
                     GroundedSchemaNode: TruthValueOf
                     $A 
                  ExecutionOutputLink
                     GroundedSchemaNode: TruthValueOf
                     $B

Mechanically, one way to find the set of admissible rules would be to apply the PatternMatcher, restricted to the scope of ExecutionRecordLinks corresponding to rules in the rule-base associated with the BC instance.

In the case where the rule-base is not that large (our initial case of interest), then one can run the PM in the mode where it finds all matches, thus obtaining the list of all admissible rules for the given target. Choice among the admissible rules may then be done as a separate step, using e.g. tournament or roulette-wheel selection based on the rule weights.

In the case of a large rule-base, one should run the PM in a mode where it tries to find several matches, but doesn’t keep going till it finds all matches. A special callback might be used here, e.g. STI-based search could be used if STI correlated with rule weight. This is not our initial case of interest so we can deal with the details later.

Premise Selection

Premise selection is a different sort of problem than rule selection, because (in PLN, at any rate) typically the number of rules in the rule base is not that large, whereas the set of possible premises may be immense.

Given a rule and a target, one can calculate a template or “pattern” to guide premise selection. For instance if the target is

   Inheritance cat animal

and the rule selected is Deduction, then the pattern would be

   Inheritance cat $B
   Inheritance $B animal

This pattern is provided to the PatternMatcher as its target. The PM then searches to find a match that maximizes the given fitness function as well as possible.

The PM may use a custom callback that seeks to find matches maximizing STI, as well as the given fitness function. In this case some balance must be struck between these factors, inside the PM’s callback. When sorting elements of the incoming set during its internal search process, it might use a comparator such as c * (fitness function satisfaction) + (1-c) * STI

On some occasion, it is possible no premise match the “pattern”. In this case, we should add the “pattern” itself as a new potential target.

Chaining

There are many different control processes appropriate for a backward chainer. One possibility is a probabilistic expansion process, where each possible step (each possible single step expanding the BC tree) is assigned a certain weight, and a step is then chosen via roulette or tournament selection.

This requires that the BC retain a list (perhaps a SetLink) pointing to all the targets explored during the BC process so far, along with all the Atoms identified or produced so far via applying rules to these targets. Any one of these is always a candidate for possible expansion (i.e. for use as a target for the next step of BC inference). This list is in effect the “backward chaining inference tree.” (If a record is stored in the Atomspace (or in an auxiliary InferenceHistory AtomSpace) indicating which inferences have been done during the course of backward chaining, then the whole BC tree can be reconstructed from this record.)

The weight assigned to a candidate should be determined via some sort of getExpansionCandidateWeight() function. Factors that might go into a simple calculation of candidate weight would include:

  • How much fitness value has been obtained via pursuing the candidate so far (this would increase weight)
  • How much effort has been spent pursuing the candidate so far (this would decrease weight) The STI of the candidate
  • An “Occam factor” that rewards simple candidates and penalizes complex ones. Here complexity may be gauged in terms of size, but potentially also including some a priori factors that weight some Atom types as simpler than others (e.g. variables and quantifiers might be considered more complex than concrete terms).
  • Potentially, a reward for being close to the original target in the inference tree (penalizing targets that involve long inference chains to get to the original target). The magnitude of this factor will determine how depth-first-like versus breadth-first-like the search is.

Backward chaining is a recursive process. To explain how it works, it suffices to explain how a single step is done, how the next step is chosen, and to describe the stopping criterion.

Stopping Criteria

Several types of stopping criteria may be used (for stopping the entire backward chaining process):

  • The amount of effort spent, most simply (though not that accurately) measured as the number of inference steps taken
  • The quality of results obtained, as measured by a target value for the goal criterion (e.g. confidence above .9; s*c > .6; etc.).
  • Decreasing returns: if the last K inference steps have not yielded more than a certain amount (say, .01) of improvement in the target value, then it may be judged best to just give up the inference process.

These criteria may also be combined.

Example

A partially worked example of a backward chain, illustrating fitness function dynamics Suppose one of the steps in the BC involves addressing the target InheritanceLink $X alive with s*c as a fitness function. One possible chain discoverable by the BC might be as follows: InheritanceLink Bob alive ==> XOR rule AND

  XORLink
     Inheritance Bob alive  
     Inheritance Bob dead

Inheritance Bob dead ==> [unification rule] AverageQuantifier $X [123]

   XORLink
       Inheritance $X alive
       Inheritance $X dead

Inheritance Bob dead <0,.99>

and then

Inheritance Bob dead <0,.99>

==>[unification rule]

Evaluation walks Bob

AverageQuantifier $X
   Implication <.9>
       Evaluation Walks $X
       AnchorNode “123”
EmbeddedTruthValueLink <0,.99>
      AnchorNode “123”
      Inheritance $X dead

This example illustrates the potential shifting of fitness function within an inference chain. Due to the nature of the XOR rule, in order to conclude that Bob is alive with a high strength (and high confidence) from the Atom labeled [123], it is necessary to conclude that Bob is dead with a low strength (and high confidence). So the XOR rule spawns a target with, say, (1-s)*c fitness function from a parent target with s*c fitness function.

Overall Forward Chaining Process

Forward chaining may be handled quite analogously to backward chaining. We may assume the forward chaining process is initialized with:

  • A set of one or more source Atoms
  • A fitness function (specified for the conclusion of each inference)
  • A rule-base (given as a set of Atoms)
  • Optionally, a focus-set of Atoms to search

High-level pseudocode for the forward chaining process described here would be as follows.

Initialize the potential-source list with the initial source Atoms

Repeat:

  1. Choose the next source, from the potential-source list (via stochastic selection based on weights that are assigned to potential sources based on multiple factors)
  2. Choose a rule to apply to the source (selecting from among the admissible rules in the rule-base via stochastic selection, based on the weights of the rules in the current context. Also, select a position for the source to assume among the premises of the rule (in general this would also be done based on the weights of the rule positions in the current context).
  3. Choose additional premises for the rule via one of two options: A) searching the “focus set” specified, or B) searching the potential-source list itself. In each case, search proceeds using the PatternMatcher (with an appropriate callback to guide the PM’s search based on the fitness functions for the additional premises)
  4. Apply the rule to the selected source premise and the chosen additional premises, and generate a conclusion
  5. Check if one of the stopping criteria has been met. If so, exit the loop and stop repeating.
  6. Push the generated conclusion onto the potential-source list
  7. Return to the start of the loop


Note, this is almost the same as the backward chaining process. The only differences are: the addition of step 7, and the fact that in step 3 one is searching for additional premises for the chosen rule, rather than searching for all the premises. Also, of course, the selection of which rules are admissible is based on which rules have some premise matching the source, rather than (as in backward chaining) which rules have a conclusion with an element matching the target.For example, suppose

  Inheritance cat animal

is chosen as a source for forward chaining, and that Deduction is chosen as a rule (based on the fact that deduction takes a premise of the format Inheritance $A $B). Then the pattern created will look like

  Inheritance cat animal
  Inheritance animal $X

or

  Inheritance $X cat
  Inheritance cat animal

depending on which position is chosen for the source. Assuming the first case, then the query submitted for pattern matching will be simply

  Inheritance animal $X

The same subtleties regarding fitness functions occur with forward chaining as with backward chaining: the fitness function specified for the conclusion, may imply a different fitness function to be used in the search for the additional premise.

Mixed Forward and Backward Chaining

Given that the forward and backward chaining processes as described above are so similar, it is unsurprising that the two can be combined together into a single “forward/backward chaining” process. We may assume the forward/backward chaining process is initialized with:

  • A set of one or more source/target Atoms
  • A fitness function (specified for the conclusion of each inference)
  • A rule-base (given as a set of Atoms)
  • Optionally, a focus-set of Atoms to search

High-level pseudocode for the forward/backward chaining process would be as follows.

Initialize the potential-source-target list with the initial source/target Atoms Repeat:

  1. Choose the next source/target, from the potential-source-target list (via stochastic selection based on weights that are assigned to potential sources or targets based on multiple factors)
  2. Choose whether to apply a forward or backward step to the source/target (via some criterion, which may be a random selection, or may be a sophisticated judgment based on the context of the inference process)
  3. Choose a rule to apply to the source/target (selecting from among the admissible rules in the rule-base via stochastic selection, based on the weights of the rules in the current context. Also, if a forward step was chosen, select a position for the source to assume among the premises of the rule (in general this would also be done based on the weights of the rule positions in the current context.
  4. Choose premises for the rule (all the premises in the case of a backward step; additional premises in the case of a forward step) based on applying the PatternMatcher with an appropriate callback and a potential restriction to the focus-set (and in the case of a forward step, a potential restriction to the potential-source-target list)
  5. In the case of a forward step, apply the rule to the selected source premise and the chosen additional premises, and generate a conclusion
  6. Check if one of the stopping criteria has been met. If so, exit the loop and stop repeating.
  7. Push the result onto the potential-source-target list: for a forward step, the generated conclusion; for a backward step, the selected premises
  8. Return to the start of the loop

Forward-backward inference could be triggered with either forward chaining or backward chaining type queries. Sometimes doing a bit of forward inference along the way can improve one’s backward inference; and vice versa.