URE Configuration Format

This document describes the Unified Rule Engine configuration format. For a complete practical usage example you may want to have a look at /examples/rule-engine and /examples/pln.

Quick Start

Here's a quick sketch of how to set up some rules for the rule engine:

 ; Define some rules
 (DefineLink
    (DefinedSchema "foo")
    (BindLink ...))
 
 (DefineLink
    (DefinedSchema "bar")
    (BindLink ...))
 
 ; Give a name to the rulebase
 (Inheritance
    (Concept "my-rule-base")
    (Concept "URE"))
 
 ; Add the rules to the rulebase
 (Member (stv 0.4 1)
    (DefinedSchema "foo")
    (Concept "my-rule-base"))
 
 (Member (stv 0.6 1)
    (DefinedSchema "bar")
    (Concept "my-rule-base"))
 
 ; Create a convenience wrapper.
 (define (my-forward-chainer SRC)
    (cog-fc (Concept "my-rule-base") SRC))

Each of these steps is explained and reviewed in greater detail below.

Configuration overview

  • All configuration parameters for the URE live in the AtomSpace.
  • Multiple configurations for different rule systems (PLN, R2L, Atomese Reduct, etc) may co-exist side by side on the same AtomSpace.
  • Rule systems can be decomposed into subsystems (R2L into R2L-en, R2L-in, etc).

Given those constraints the following initial format is suggested. It is inspired by (and almost identical to) Amen's r2l-en config file /opencog/nlp/relex2logic/r2l-en-rulebase.scm.

Naming rulesets

A set of rules is named by using a ConceptNode. For example,

ConceptNode "PLN"
ConceptNode "R2L"

names two different rule systems: PLN and R2L. These systems may be subdivided into subsystems. For example:

ConceptNode "PLN-crisp"
ConceptNode "PLN-uncertain"
ConceptNode "PLN-quantifier"

Subsystems should be connected to their parent system by inheritance relationships:

InheritanceLink
   ConceptNode "PLN-quantifier"
   ConceptNode "PLN"

These inheritance relationships are used to pass configuration parameters to the subsystems. For instance, if the control policy is the same across all the PLN subsystems, the parameters pertaining to the control policy will only need to be defined on "PLN".

All systems may inherit from

ConceptNode "URE"

For example:

InheritanceLink
   ConceptNode "PLN"
   ConceptNode "URE"

Any parameter set on URE is automatically inherited by all systems, unless it is overridden within the (sub-)system.
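
The hierarchy above can be written directly in Scheme. This is only a sketch: it assumes the (opencog) Guile module is available, and it reuses the subsystem names from the examples above.

```scheme
(use-modules (opencog))

; The top-level rule system inherits from the generic URE node.
(InheritanceLink
   (ConceptNode "PLN")
   (ConceptNode "URE"))

; Each subsystem inherits from its parent rule system, so that
; parameters set on "PLN" (or on "URE") apply to it as well.
(InheritanceLink
   (ConceptNode "PLN-crisp")
   (ConceptNode "PLN"))

(InheritanceLink
   (ConceptNode "PLN-uncertain")
   (ConceptNode "PLN"))

(InheritanceLink
   (ConceptNode "PLN-quantifier")
   (ConceptNode "PLN"))
```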

Configuration Parameters

The maximum number of iterations can be set to 20 for the URE as follows:

ExecutionLink
   SchemaNode "URE:maximum-iterations"
   ConceptNode "URE"
   NumberNode 20

This can be overridden for PLN and all of the PLN subsystems as follows:

ExecutionLink
   SchemaNode "URE:maximum-iterations"
   ConceptNode "PLN"
   NumberNode 2000
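
In Scheme, the two declarations above might be loaded as follows. This is a sketch that assumes the (opencog) Guile module; the values are the ones used in the examples above.

```scheme
(use-modules (opencog))

; Default for every rule system inheriting from "URE".
(ExecutionLink
   (SchemaNode "URE:maximum-iterations")
   (ConceptNode "URE")
   (NumberNode 20))

; Override for "PLN" and, through inheritance, its subsystems.
(ExecutionLink
   (SchemaNode "URE:maximum-iterations")
   (ConceptNode "PLN")
   (NumberNode 2000))
```

Under the inheritance scheme described earlier, a subsystem such as "PLN-quantifier" that defines no value of its own would then be expected to use 2000 rather than 20.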

Defining Rules

A rule is defined as follows:

DefineLink
   <rule-alias>
   <rule-body>

where <rule-alias> is

DefinedSchemaNode <rule-name>

and <rule-body> is

BindLink
   <variables>
   AndLink
      <clause-1>
      ...
      <clause-n>
   <conclusion>

where <variables> is a variable or a list of variables, and each <clause-i> is either a pattern to match or a precondition (a virtual clause). It is strongly advised to type each variable so that the URE does not get trapped in infinite recursion (see TypedVariableLink).

<conclusion> is either a pattern or a formula call (note that in order to be compatible with the backward chainer it must be a formula call, see the related issue):

ExecutionOutputLink
   <formula>
   <arguments>

where <arguments> is either

  1. a ListLink, in which case the first argument represents the conclusion pattern and the following ones the premises (note that the premises can be wrapped in a SetLink if the formula is symmetric, which may speed up the backward chainer),
  2. or something else, in which case it represents the conclusion.

It is also highly recommended to make the formula's premises optional, for instance in Scheme by using a rest argument:

(define (formula conclusion . premises) ...)

That way, if some premises are missing (which can happen if formula calls are nested and some results turn out to be undefined), the pattern matcher raises no exception. This speeds up formula application, because processing exceptions is very slow.
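
Putting the pieces of this section together, a complete rule might look like the sketch below. Everything named here is hypothetical and chosen only for illustration: the rule name "my-deduction-rule", the formula my-deduction-formula, and the placeholder truth value it assigns. The sketch assumes the (opencog) and (opencog exec) Guile modules.

```scheme
(use-modules (opencog) (opencog exec))

; Hypothetical formula. The premises are a rest argument, so a call
; with missing premises does not raise an exception.
(define (my-deduction-formula conclusion . premises)
  (if (= (length premises) 2)
      ; Placeholder TV; a real formula would compute it from the premises.
      (cog-set-tv! conclusion (stv 1 1))
      conclusion))

; Hypothetical rule: from A->B and B->C conclude A->C.
(DefineLink
   (DefinedSchemaNode "my-deduction-rule")
   (BindLink
      ; Typed variables, as advised above.
      (VariableList
         (TypedVariableLink (VariableNode "$A") (TypeNode "ConceptNode"))
         (TypedVariableLink (VariableNode "$B") (TypeNode "ConceptNode"))
         (TypedVariableLink (VariableNode "$C") (TypeNode "ConceptNode")))
      (AndLink
         ; Patterns to match
         (InheritanceLink (VariableNode "$A") (VariableNode "$B"))
         (InheritanceLink (VariableNode "$B") (VariableNode "$C"))
         ; Precondition (virtual clause): $A and $C must differ
         (NotLink
            (IdenticalLink (VariableNode "$A") (VariableNode "$C"))))
      ; Formula call: conclusion pattern first, then the premises
      (ExecutionOutputLink
         (GroundedSchemaNode "scm: my-deduction-formula")
         (ListLink
            (InheritanceLink (VariableNode "$A") (VariableNode "$C"))
            (InheritanceLink (VariableNode "$A") (VariableNode "$B"))
            (InheritanceLink (VariableNode "$B") (VariableNode "$C"))))))
```

The conclusion is expressed as a formula call rather than a bare pattern so that the rule stays usable by the backward chainer, as noted above.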

Adding rules to a ruleset

Rules are added to a ruleset by declaring them as members. Rules must be named to be added to a ruleset. For example:

MemberLink
    DefinedSchemaNode "my-rule"
    ConceptNode "PLN"

The truth value on the MemberLink may be set to define a preference for the usage of the rule. Semantically it represents the probability that the rule will produce the desired outcome. Uncertainty is taken into account, so be careful how you set the confidence. If no truth value is set, the default TV (stv 1 0) is used; its zero confidence means that the rule is picked according to a uniform distribution.

The reason we want to use

DefinedSchemaNode "my-rule"

as opposed to the rule itself (the BindLink) is to store the rule name in the AtomSpace. This is convenient to create more human-readable inference traces.

There is a Scheme function, `ure-add-rules`, to easily define a rule set. For instance

(ure-add-rules my-rbs (list rule-1 ... rule-n))

will produce

MemberLink
    <rule-1>
    <my-rbs>
...
MemberLink
    <rule-n>
    <my-rbs>

If you wish to associate TVs with the rules, you may use pairs of rule and TV instead

(ure-add-rules my-rbs (list (list rule-1 tv-1) ... (list rule-n tv-n)))

which will produce

MemberLink <tv-1>
    <rule-1>
    <my-rbs>
...
MemberLink <tv-n>
    <rule-n>
    <my-rbs>

Control Policy

Operation of the chainers is controlled by several parameters: fitness functions for selecting sources (for the forward chainer), targets (for the backward chainer) and rules, a termination criterion, and the balance between breadth and depth search.

Fitness Function

There are at least three fitness functions involved:

  1. Fitness for choosing the next source (forward chainer)
  2. Fitness for choosing the next target (backward chainer)
  3. Fitness for choosing the next rule given a certain source or target

Fitness for choosing the next source

Currently in the code, the fitness for choosing the next source or target is hardwired in URECommons::tv_fitness. This will have to be addressed and this section updated accordingly.

Fitness for choosing the next target

This fitness function is not exposed as a parameter yet. The default is based on confidence: the target with the least confidence is chosen first, since the default BC goal is to maximize confidence.

Fitness for choosing the next rule

The easiest way to control rule choice is by their associated TVs on the MemberLink. For instance

MemberLink <0.1 0.01>
   <PLN-modus-ponens-name>
   ConceptNode "PLN"

MemberLink <0.2 0.01>
   <PLN-deduction-rule-name>
   ConceptNode "PLN"

would indicate that the deduction rule has a probability of 0.2 of producing the desired outcome, twice that of the modus ponens rule. The low confidence of 0.01 allows greater exploration; if it were 1, then given the choice between these two rules, the deduction rule would always be picked.

TODO: further control is possible by specifying inference control rules; documentation is still in the making.

Termination Criteria

At this time, the only stopping criterion is the number of steps. Numerical parameters such as this one are stored in ExecutionLinks. These take the form:

ExecutionLink
   SchemaNode "URE:maximum-iterations"
   <rule-base>
   <max-iterations>

Boolean criteria are represented with EvaluationLinks, taking the form

EvaluationLink <TV>
  PredicateNode "URE:attention-allocation"
  <rule-base>

If TV.strength is > 0.5, then it indicates that attention allocation is enabled.
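
As a sketch, both kinds of parameter could be set for a rule base as follows. The rule-base name "my-rule-base" and the values are placeholders, and the (opencog) Guile module is assumed.

```scheme
(use-modules (opencog))

; Numerical parameter: stop after at most 100 iterations.
(ExecutionLink
   (SchemaNode "URE:maximum-iterations")
   (ConceptNode "my-rule-base")
   (NumberNode 100))

; Boolean parameter: a strength above 0.5 enables attention allocation.
(EvaluationLink (stv 1 1)
   (PredicateNode "URE:attention-allocation")
   (ConceptNode "my-rule-base"))
```

Setting the truth value to (stv 0 1) instead would indicate that attention allocation is disabled.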

Breadth vs Depth Search

The backward chainer lets you control the balance between breadth-first and depth-first search via a complexity penalty parameter. The higher the complexity penalty, the closer it is to a breadth-first search. The lower the complexity penalty, the closer it is to a depth-first search.

ExecutionLink
   SchemaNode "URE:BC:complexity-penalty"
   <rule-base>
   <cpx>

cpx ranges from 0 to +inf.

Back-Inference-Tree Reduction

The backward chainer lets you control the growth of the Back-Inference Tree (BIT). The following parameter only allows the BIT to grow up to a certain size. Once this size is reached, portions of the BIT are trimmed based on their likelihood of being expanded, that is, the portions least likely to be expanded are removed first.

ExecutionLink
   SchemaNode "URE:BC:maximum-bit-size"
   <rule-base>
   <size>

If size is negative or zero, then the BIT can grow without limit.
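
The two backward chainer parameters described above might be set together like this. The rule-base name "my-rule-base" and both values are placeholders picked for illustration, and the (opencog) Guile module is assumed.

```scheme
(use-modules (opencog))

; A small complexity penalty leans toward depth-first search;
; larger values lean toward breadth-first search.
(ExecutionLink
   (SchemaNode "URE:BC:complexity-penalty")
   (ConceptNode "my-rule-base")
   (NumberNode 0.1))

; Trim the BIT once it reaches 1000 entries. A value that is
; negative or zero removes the limit.
(ExecutionLink
   (SchemaNode "URE:BC:maximum-bit-size")
   (ConceptNode "my-rule-base")
   (NumberNode 1000))
```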

User Defined Control Policy

Ultimately, we need to allow users to define their own control policy (without having to fiddle with the C++ URE code). One possible way to do this would be to make the fitness functions and termination criteria user-redefinable. We would need to provide enough access functions to expose all the knowledge relevant to controlling an inference, such as its previous N steps. This might then be expressive enough to let users define any control policy they want, such as macros (applying a sequence of rules in a certain order), mutual exclusivity, etc.

Usage

So far, the URE provides two functions, the forward chainer and the backward chainer, called cog-fc and cog-bc respectively. Before invoking either of them, you should have defined and loaded the rule-base, its rules and its configuration parameters into the AtomSpace, as described above on this page.

To use the chainers, pass the following arguments to cog-fc or cog-bc:

  1. The rule-base.
  2. The source (for cog-fc) or target (for cog-bc). You may wrap multiple sources in a SetLink. The empty SetLink considers the entire atomspace (or focus set) as sources, but will apply all rules at once. Another way to consider the entire atomspace (or focus set) as sources is to use a mere VariableNode, possibly typed but not necessarily.
  3. Optionally the variable declaration of the source/target. If you wish to use this argument add #:vardecl <my-vardecl> in the argument list.
  4. Optionally the focus set. If you wish to use this argument add #:focus-set <my-focus-set> in the argument list.

For more details and examples about the usage of cog-fc and cog-bc, see opencog/rule-engine#forward-chaining and opencog/rule-engine#backward-chaining respectively.
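
The calls below sketch how these arguments fit together. Several things are assumptions rather than facts from this page: the module name (opencog rule-engine), the rule base "my-rule-base" (which must already be configured as described above), the example atoms, and the use of a SetLink for the focus set.

```scheme
(use-modules (opencog) (opencog rule-engine))

(define rbs (ConceptNode "my-rule-base"))

; Forward chaining from a single source.
(cog-fc rbs (InheritanceLink (ConceptNode "A") (ConceptNode "B")))

; Forward chaining from several sources wrapped in a SetLink.
(cog-fc rbs
   (SetLink
      (InheritanceLink (ConceptNode "A") (ConceptNode "B"))
      (InheritanceLink (ConceptNode "B") (ConceptNode "C"))))

; Backward chaining on a target containing a typed variable,
; restricted to a focus set.
(cog-bc rbs
   (InheritanceLink (ConceptNode "A") (VariableNode "$X"))
   #:vardecl (TypedVariableLink
                (VariableNode "$X")
                (TypeNode "ConceptNode"))
   #:focus-set (SetLink
                  (InheritanceLink (ConceptNode "A") (ConceptNode "B"))
                  (InheritanceLink (ConceptNode "B") (ConceptNode "C"))))
```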