DualLink

From OpenCog
Jump to: navigation, search

The DualLink is used to find the premises of rules, equivalently, to find patterns that match data. In a certain sense, it is dual to, or opposite to pattern matching. It implements a dynamic variant of the Rete aglorithm, and is meant to be used in the same situations where the Rete algorithm would be useful: to quickly find all of the rules that might be applicable to a given situation. For certain types of matching, such as that used in AIML, it organizes data similar to a trie.

Motivation

When designing a rule engine, one is (eventually) faced with the task of determining which rules can be applied at some given point of the calculations. One can blindly try all of the rules, and see which ones fire. When there are more than a few dozen rules, this becomes impractical. This issue was observed, and resolved in the 1970's and 1980's, with the Rete algorithm: one organizes the set of rules into a trie, which is then very easily and quickly walked, to determine which ones can fire.

OpenCog and the AtomSpace do NOT (explicitly) implement tries or Rete! However, the general (hyper-)graph structure of OpenCog Atoms already contains enough connectivity information to accomplish more or less the same thing: a kind-of "dynamic Rete", where rulesets can be searched for, on-demand, at runtime.

The core idea is that pattern recognition is dual to pattern matching. If we define "pattern matching" as the idea of finding all data that matches a pattern, then "pattern recognition" is the act of finding all patterns that match the data. In pattern matching, where a pattern has variables in it, one finds all groundings (concrete values for the variables) that cause the pattern expression to evaluate to "true". In pattern recognition, one has a single "grounding", and asks for all pattern expressions that evaluate to "true" when applied to this grounding.

The Atomese construct for accomplishing this is the DualLink. It is roughly the opposite of the GetLink, "opposite" in the cat-theory sense of reversing arrows. Its "rough", in that the type constraints that are possible in GetLink are ignored by the DualLink.

Examples

The example below is based on an AIML-like search, simply because this is easy to explain and demonstrate. Note that all AIML chatbots maintain a trie of AIML rules, and so AIML is a "natural" example of pattern recognition. The atomese DualLink is, however, a general pattern recognizer: it can be used in a general setting, not just for AIML-like structures.

If the atomspace contains this:

(BindLink
  (ListLink
     (Concept "I") (Glob "$star") (Concept "you"))
  (ListLink
     (Concept "I") (Glob "$star") (Concept "you") (Concept "too")))

Then the following:

(cog-execute! (DualLink
   (ListLink (Concept "I") (Concept "love") (Concept "you"))))

will return the premise of the BindLink:

(ListLink
     (Concept "I") (Glob "$star") (Concept "you"))

An extended example can be found in /examples/pattern-matcher/recognizer.scm.

UnorderedLink example

Works fine with UnorderedLinks as well. For example:

(UnorderedLink
    (ConceptNode "this")
    (GlobNode "$star")
    (ConceptNode "is"))

(UnorderedLink
    (GlobNode "$star")
    (ConceptNode "this")
    (ConceptNode "test"))

(UnorderedLink
    (VariableNode "$star")
    (ConceptNode "this")
    (ConceptNode "test"))

(UnorderedLink
    (VariableNode "$star-b")
    (VariableNode "$star-a")
    (ConceptNode "this")
    (ConceptNode "test"))

(cog-execute!  (Dual
(UnorderedLink
    (ConceptNode "is")
    (ConceptNode "test")
    (ConceptNode "this")
    (ConceptNode "a"))))

does exacatly what is expected: it finds both GlobNode rules, and just one f the VariableNode rules, the one with two variables in it. The single-variable rule cannot match, because the arity is incorrect.