Unified rule engine

From OpenCog
(Redirected from URE Chainer Design)
Jump to: navigation, search

The unified rule engine (URE for short) is a generic opencog rule engine operating on the AtomSpace. Two chaining modes are currently supported, Forward Chaining and Backward Chaining. It is mostly built on top of the Pattern Matcher. For more history about its design see URE historical discussion.

Configuring the URE

For usage and configuration (rules and parameters) see URE Configuration Format.

Forward Chainer

The forward chainer applies rules from premises to conclusions. It currently uses a rather brute force algorithms, select sources and rules somewhat randomly, apply these to produce conclusions, re-insert them are new sources and re-iterate till a stop criterion has been met.

Forward Chainer Algorithm

The forward chainer is initialized with:

  • A set of sources, that may or not contain variables, from which to start the chaining process.
  • Optionally a focus set, to restrict rule application to a subset of the atomspace.
  • A rule-base, or rule-based system, defining a set of rules and parameters, amongst which are:
    • Fitness functions over sources and rules, to drive the inference control.
    • (Optional) Other parameters for termination and finer control. See URE Parameters for more details.


The high-level pseudocode of the forward chainer is as follows:

  1. Initialize the potential-source set with the initial source atoms.
  2. Randomly select the next source from the potential-source set (according to a source fitness function based distribution).
  3. Amongst all valid rules, randomly select one (according to a rule fitness function based distribution).
  4. Unify the selected rule with the selected source, and apply it to generate conclusions.
  5. Insert the conclusions to the potential-source set.
  6. Check if one of the stopping criteria has been met. If so, exit and return the potential-source set, otherwise repeat from step 2.

Fitness Functions

The source/rule fitness functions should reflect the likelihood that a source or a rule being selected will contribute the inference goal. Inference goal may be such as increasing novelty, confidence, etc. For now though the source fitness function is either disabled (uniform) or following the simple heuristic

F = s^x * c^(2-x)

where s is strength, c is confidence and x is some fixed value. That is it selects sources that have both high strength and confidence. This is largely arbitrary as it ultimately depends on the inference goal. Ideally, there should exist a reasoning chain going from the inference goal to the fitness function so that the latter is proven useful at least probabilistically (this is, BTW, one avenue for enabling inference control by meta learning and reasoning).

The rule fitness function is currently equally crude, relying on rule weight. The larger the weight the more likely to be selected.

Rule Unification and Application

Unifying source and rule, as required in step 4, means unifying the source pattern with one of the rule premises patterns. For instance if the source is:

InheritanceLink
  ConceptNode "John"
  VariableNode "$X"

and the rule is (a simplified version of deduction tailored for this example):

BindLink
  VariableList
    VariableNode "$A"
    VariableNode "$B"
    VariableNode "$C"
  AndLink
    InheritanceLink 
      VariableNode "$A"
      VariableNode "$B"
    InheritanceLink 
      VariableNode "$B"
      VariableNode "$C"
  ExecutionOutputLink
    GroundedSchemaNode "scm: deduction-formula"
    ListLink
      InheritanceLink
        VariableNode "$A"
        VariableNode "$C"
      InheritanceLink
        VariableNode "$A"
        VariableNode "$B"
      InheritanceLink
        VariableNode "$B"
        VariableNode "$C"

The rule premises are

 InheritanceLink
   VariableNode "$A"
   VariableNode "$B"

and

 InheritanceLink
   VariableNode "$B"
   VariableNode "$C"

The source here can be unified with both premises. Either with the first one, where

VariableNode "$A" -> ConceptNode "John"
VariableNode "$B" -> VariableNode "$X" 

or with the second one, where

VariableNode "$B" -> ConceptNode "John"
VariableNode "$C" -> VariableNode "$X" 

either way the rule can be applied over the atomspace, but before this it will be rewritten to be specialized from that source, that is (assuming the first unification is selected):

BindLink
  VariableList
    VariableNode "$B"
    VariableNode "$C"
  AndLink
    InheritanceLink 
      ConceptNode "John"
      VariableNode "$B"
    InheritanceLink 
      VariableNode "$B"
      VariableNode "$C"
  ExecutionOutputLink
    GroundedSchemaNode "scm: deduction-formula"
    ListLink
      InheritanceLink
        ConceptNode "John"
        VariableNode "$C"
      InheritanceLink
        ConceptNode "John"
        VariableNode "$B"
      InheritanceLink
        VariableNode "$B"
        VariableNode "$C"

By doing that we restrict the rule application over the selected source, decreasing its computational cost and pollution as well.

Termination

For now termination is solely determined by the number of forward steps applied. More termination criteria may be introduced in the future involving how well the inference goal is fulfilled, etc.

Backward Chainer

Conceptually one may see Backward Chaining as a search for a forward strategy that would lead to the target. That is in practice how the Backward Chainer algorithm works, by evolving Foward Chaining Strategies, FCS for short, that may prove, by construction, the given target.

Back-Inference Tree

The structure holding the evolving FCSs is called the Back-Inference Tree (BIT for short), and is inspired by Ari Heljakka's C++ Backward Chainer design. The difference here is that the BIT is represented in the AtomSpace. The discussion that led to this design can be found in this [issue]. A BIT is an and-or-tree, with a target at its root. The and-branches represent the multiplicity of premises that must be fulfilled in order to apply a rule, while the or-branches represent the multiplicity of rules that can match a target. Another way to represent it is as a collection of and-BITs, where each and-BIT represent a possible configuration of and-branches, that is obtained following a certain path along the or-branches. The collection of and-BITs then represent the various and-branches of or-branches, that is the BIT.

And-BIT

More specifically an and-BIT, or interchangeably a FCS, is a BindLink, looking like a rule with the exception that the rewrite term may be composed of multiple formula applications. For instance the FCS representing a deduction application to obtain target (in scheme format to display the atom values as well)

(InheritanceLink
  (VariableNode "$X") ; [6809909406030619949][1]
  (ConceptNode "D") ; [246112806454457922][1]
) ; [13295355732252075959][1]

would be

(BindLink
  (VariableList
    (VariableNode "$X") ; [6809909406030619949][1]
    (TypedVariableLink
      (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
      (TypeNode "ConceptNode") ; [3788388250666256890][1]
    ) ; [11215517665988510382][6]
  ) ; [15424546366970522031][6]
  (AndLink
    (InheritanceLink
      (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
      (ConceptNode "D") ; [246112806454457922][1]
    ) ; [16015351290941397556][6]
    (InheritanceLink
      (VariableNode "$X") ; [6809909406030619949][1]
      (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
    ) ; [17146615216377982335][6]
  ) ; [12118664779578609880][6]
  (ExecutionOutputLink
    (GroundedSchemaNode "scm: deduction-formula") ; [5481571512010723578][1]
    (ListLink
      (InheritanceLink
        (VariableNode "$X") ; [6809909406030619949][1]
        (ConceptNode "D") ; [246112806454457922][1]
      ) ; [13295355732252075959][1]
      (InheritanceLink
        (VariableNode "$X") ; [6809909406030619949][1]
        (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
      ) ; [17146615216377982335][6]
      (InheritanceLink
        (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
        (ConceptNode "D") ; [246112806454457922][1]
      ) ; [16015351290941397556][6]
    ) ; [12005927857233859901][6]
  ) ; [10461083420299819741][6]
) ; [16509059222210966642][6]

which looks like a rule after unification. This FCS may be represented graphically as a an up-side-down and-tree

[17146615216377982335][6] [16015351290941397556][6]
-----------------deduction-formula-----------------
             [13295355732252075959][1]

where the premises are over the line and the conclusion (the target) under the line.

An expanded FCS from it, where an additional deduction occurs may be

(BindLink
  (VariableList
    (VariableNode "$X") ; [6809909406030619949][1]
    (TypedVariableLink
      (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
      (TypeNode "ConceptNode") ; [3788388250666256890][1]
    ) ; [11215517665988510382][6]
    (TypedVariableLink
      (VariableNode "$B-6229393a") ; [6185394777777469381][6]
      (TypeNode "ConceptNode") ; [3788388250666256890][1]
    ) ; [15556655485509547465][6]
  ) ; [17281223568524105048][6]
  (AndLink
    (InheritanceLink
      (VariableNode "$B-6229393a") ; [6185394777777469381][6]
      (ConceptNode "D") ; [246112806454457922][1]
    ) ; [11133117073607658831][6]
    (InheritanceLink
      (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
      (VariableNode "$B-6229393a") ; [6185394777777469381][6]
    ) ; [12731261225409633207][6]
    (InheritanceLink
      (VariableNode "$X") ; [6809909406030619949][1]
      (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
    ) ; [17146615216377982335][6]
  ) ; [16361759001634035305][6]
  (ExecutionOutputLink
    (GroundedSchemaNode "scm: deduction-formula") ; [5481571512010723578][1]
    (ListLink
      (InheritanceLink
        (VariableNode "$X") ; [6809909406030619949][1]
        (ConceptNode "D") ; [246112806454457922][1]
      ) ; [13295355732252075959][1]
      (InheritanceLink
        (VariableNode "$X") ; [6809909406030619949][1]
        (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
      ) ; [17146615216377982335][6]
      (ExecutionOutputLink
        (GroundedSchemaNode "scm: deduction-formula") ; [5481571512010723578][1]
        (ListLink
          (InheritanceLink
            (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
            (ConceptNode "D") ; [246112806454457922][1]
          ) ; [16015351290941397556][6]
          (InheritanceLink
            (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
            (VariableNode "$B-6229393a") ; [6185394777777469381][6]
          ) ; [12731261225409633207][6]
          (InheritanceLink
            (VariableNode "$B-6229393a") ; [6185394777777469381][6]
            (ConceptNode "D") ; [246112806454457922][1]
          ) ; [11133117073607658831][6]
        ) ; [10363704109909197645][6]
      ) ; [18042231709829933293][6]
    ) ; [14032808276122395638][6]
  ) ; [12487963839188355478][6]
) ; [12956955273435071231][6]

where the second premise

(InheritanceLink
  (VariableNode "$B-6266d6f2") ; [4097372290580364298][6]
  (ConceptNode "D") ; [246112806454457922][1]
) ; [16015351290941397556][6]

has been replaced by a deduction application. This expanded FCS may be graphically represented by

             [12731261225409633207][6] [11133117073607658831][6]
             -----------------deduction-formula-----------------
[17146615216377982335][6] [16015351290941397556][6]
-----------------deduction-formula-----------------
             [13295355732252075959][1]

All atoms of an and-BIT (at least ones in the graphical representation are called BIT-nodes. Amongst the BIT-nodes, all expandable ones are called and-leaves, or targets depending on the context, here they are [17146615216377982335][6], [12731261225409633207][6] and [11133117073607658831][6].

Backward Chainer Algorithm

Similarily to the Forward Chainer, the Backward Chainer is initialized with:

  • A target, that may or not contain variables, from which to start the back chaining process.
  • Optionally, a focus set, to restrict rule application to a subset of the atomspace.
  • A rule-base, or rule-based system, defining a set of rules and parameters, amongst which are:
    • Fitness functions over and-BITs, and-leaves and rules, to drive the BIT expansion.
    • Other parameters for termination and finer control. See URE Parameters for more details.


The high-level pseudocode of the backward chainer is as follows:

  1. Expand the BIT. The first time, initialize the BIT with the target, otherwise:
    1. Randomly select an and-BIT (according to the and-BIT fitness).
    2. Randomly select an and-leaf of that and-BIT (according to the and-leaf fitness).
    3. Randomly select a rule amongst the valid ones (according to the rule fitness).
    4. Unify the select rule with the selected and-leaf, and expand the BIT.
  2. Fulfill the BIT.
    1. Select an and-BIT (typically the one just being expanded)
    2. Apply its corresponding FCS to the atomspace, insert the results to the solution set, if any.
  3. Reduce the BIT. To not let the BIT grow indefinitely large, randomly remove and-BITs that are the least likely to be expanded.
  4. Check if one of the stopping criteria has been met. If so, exit and return the solution set, otherwise repeat from step 1.

Fitness Functions

Like for the forward chainer the fitness functions are rather arbitrary for now and not formally tied to an inference goal. Though, as they have been coded thus far, the implicit inference goal is to maximize the confidence of the target. It is to be noted as well that the BC design allows the BC to assign a per BIT-node fitness, that is because fulfilling some inference goal on some target, may entail fulfilling another goal on some of its premises.

The and-BIT fitness functions are only 2, uniform or trace based, compounded with a complexity penalty measure. The complexity penalty disfavors overly complex and-BITs. Another way to look at it is that it controls whether BIT expansion follows a breath or depth first search. If the complexity penalty parameter is high then BIT expansion looks like a breath first search, otherwise it looks more like a depth first search. The trace based fitness function is merely for debugging, and rapidely recreate a certain series of expansions.

The and-leaf fitness function will favor and-leaves with low confidence, the rational being expanding these may ultimately increase their confidence.

Like for the forward chaining, rule fitness is controlled by their weights.

Target and Rule Unification

Unifying a rule and target, as required in step 1.4, means unifying the rule conclusion and the target. Unification works in same way as with the forward chainer, and a specialized rule is created at the issue of the unification. The only difference is that it is performed between the target and the rule conclusion instead of between the source and the rule premises.

Termination

For now termination is solely determined by the number of BIT expansions. More termination criteria may be introduced in the future involving how well the inference goal is fulfilled, etc.

Mixed Forward and Backward Chaining [Not implemented yet]

Given that the forward and backward chaining processes as described above are so similar, it is unsurprising that the two can be combined together into a single “forward/backward chaining” process. We may assume the forward/backward chaining process is initialized with:

  • A set of one or more source/target atoms.
  • Fitness functions.
  • A rule-base.
  • Optionally, a focus-set.

High-level pseudocode for the forward/backward chaining process would be as follows.

Initialize the potential-source-target list with the initial source/target Atoms Repeat:

  1. Choose the next source/target, from the potential-source-target list (via stochastic selection based on weights that are assigned to potential sources or targets based on multiple factors)
  2. Choose whether to apply a forward or backward step to the source/target (via some criterion, which may be a random selection, or may be a sophisticated judgment based on the context of the inference process)
  3. Choose a rule to apply to the source/target (selecting from among the admissible rules in the rule-base via stochastic selection, based on the weights of the rules in the current context. Also, if a forward step was chosen, select a position for the source to assume among the premises of the rule (in general this would also be done based on the weights of the rule positions in the current context.
  4. Choose premises for the rule (all the premises in the case of a backward step; additional premises in the case of a forward step) based on applying the PatternMatcher with an appropriate callback and a potential restriction to the focus-set (and in the case of a forward step, a potential restriction to the potential-source-target list)
  5. In the case of a forward step, apply the rule to the selected source premise and the chosen additional premises, and generate a conclusion
  6. Check if one of the stopping criteria has been met. If so, exit the loop and stop repeating.
  7. Push the result onto the potential-source-target list: for a forward step, the generated conclusion; for a backward step, the selected premises
  8. Return to the start of the loop

Forward-backward inference could be triggered with either forward chaining or backward chaining type queries. Sometimes doing a bit of forward inference along the way can improve one’s backward inference; and vice versa.