GlobNode

The GlobNode is a type of VariableNode that can match multiple successive atoms during pattern matching. An ordinary VariableNode, by contrast, matches exactly one atom. See the Wikipedia article on glob for a definition of globbing.

Example

The pattern below rewrites "I * you" to "I * you too".

(BindLink
  (ListLink
    (ConceptNode "I")
    (GlobNode "$star")
    (ConceptNode "you"))
  (ListLink
    (ConceptNode "I")
    (GlobNode "$star")
    (ConceptNode "you")
    (ConceptNode "too")))

When applied to this:

(ListLink
  (ConceptNode "I")
  (ConceptNode "really")
  (ConceptNode "totally")
  (ConceptNode "need")
  (ConceptNode "you"))

it will produce the output:

(ListLink
  (ConceptNode "I")
  (ConceptNode "really")
  (ConceptNode "totally")
  (ConceptNode "need")
  (ConceptNode "you")
  (ConceptNode "too"))
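
The rewrite above can be mimicked with a small toy model. This is only a sketch of the matching semantics, not the OpenCog pattern matcher: atoms are modeled as plain strings, the GLOB marker and the match_glob and rewrite helpers are hypothetical names invented for this illustration, and the pattern is assumed to contain exactly one glob.

```python
# Toy model only: atoms are plain strings, and GLOB stands in for
# (GlobNode "$star"). None of these names are part of OpenCog.
GLOB = object()

def match_glob(pattern, atoms):
    """Return the atoms bound to the single GLOB in pattern, or None."""
    i = pattern.index(GLOB)
    prefix, suffix = pattern[:i], pattern[i + 1:]
    bound_len = len(atoms) - len(prefix) - len(suffix)
    if bound_len < 1:
        # The default glob must match one or more atoms.
        return None
    if atoms[:len(prefix)] != prefix:
        return None
    if suffix and atoms[-len(suffix):] != suffix:
        return None
    return atoms[len(prefix):len(atoms) - len(suffix)]

def rewrite(pattern, template, atoms):
    """Substitute the atoms matched by GLOB into template."""
    bound = match_glob(pattern, atoms)
    if bound is None:
        return None
    out = []
    for item in template:
        if item is GLOB:
            out.extend(bound)
        else:
            out.append(item)
    return out

pattern = ["I", GLOB, "you"]
template = ["I", GLOB, "you", "too"]
result = rewrite(pattern, template, ["I", "really", "totally", "need", "you"])
print(result)
```

In this model the glob binds "really", "totally" and "need" as a single run. An input such as "I you" is rejected because the glob would have to match zero atoms.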

Typed globs

The current, default implementation of the GlobNode matches one or more sequential atoms in a list. However, there are plausible use cases where one may want to match zero or more times, or match no more than N times. This section describes an unimplemented proposal for how this could be done.

The core insight of the proposal is to use the TypedVariableLink, and all of its accompanying features, to specify how the GlobNode should work, and what it should match.

The example below specifies a Glob that must be matched at least twice, but no more than three times:

TypedVariableLink
    GlobNode   "$foo"
    IntervalLink
        NumberNode  2
        NumberNode  3

It uses the IntervalLink to specify a numeric interval, giving the lower and upper bounds on the number of atoms matched. The interval can be combined with the usual type specification mechanism. Thus,

TypedVariableLink
    GlobNode   "$foo"
    IntervalLink
        NumberNode  2
        NumberNode  3
    TypeNode "ConceptNode"

indicates that either two or three matches must be made, and the matching type must be ConceptNode. In place of TypeNode here, it should also be possible to use TypeChoiceLink, SignatureLink, and so on.
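
The intended meaning of the interval and type constraints can be sketched as a simple acceptance check on a candidate run of atoms. This is a toy model under stated assumptions, since the proposal is unimplemented: the Atom tuple and the glob_accepts function are hypothetical names, and both interval bounds are assumed to be inclusive.

```python
from typing import NamedTuple, Optional, Sequence

class Atom(NamedTuple):
    """Hypothetical stand-in for an atom: a type name plus a name."""
    type: str
    name: str

def glob_accepts(run: Sequence[Atom], lower: int, upper: int,
                 atom_type: Optional[str] = None) -> bool:
    """Check a candidate run against an interval and an optional type.

    Assumes both bounds are inclusive, as in "at least twice, but no
    more than three times".
    """
    if not lower <= len(run) <= upper:
        return False
    return atom_type is None or all(a.type == atom_type for a in run)

two_concepts = [Atom("ConceptNode", "really"), Atom("ConceptNode", "need")]
mixed = [Atom("ConceptNode", "really"), Atom("PredicateNode", "likes")]

print(glob_accepts(two_concepts, 2, 3, "ConceptNode"))
print(glob_accepts(mixed, 2, 3, "ConceptNode"))
print(glob_accepts(two_concepts[:1], 2, 3, "ConceptNode"))
```

The first call accepts the run. The second fails on the type constraint and the third on the lower bound.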

Typing Overkill

By introducing one more atom type, the TypeSetLink, one could get a fully general solution, although this seems like overkill. For example:

TypedVariableLink
    GlobNode   "$foo"
    TypeChoiceLink
        TypeSetLink
            IntervalLink
                NumberNode  2
                NumberNode  3
            TypeNode "ConceptNode"
        TypeSetLink
            IntervalLink
                NumberNode  7
                NumberNode  8
            TypeNode "PredicateNode"

This states that there could be either 2 or 3 matches to ConceptNode, or 7 or 8 matches to PredicateNode. The TypeChoiceLink exists already; the TypeSetLink is new. One could use a SetLink here instead of a TypeSetLink, but creating a new, distinct type could help make such expressions slightly more readable.
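
One way to read this combination: each TypeSetLink is a conjunction of constraints, and the TypeChoiceLink is a disjunction over those sets. The sketch below models that reading only. The accepts_any function and the (lower, upper, type) tuple encoding are hypothetical, and interval bounds are again assumed to be inclusive.

```python
def accepts_any(run_types, alternatives):
    """Return True if the run satisfies at least one alternative.

    run_types: type names of the atoms in the candidate run.
    alternatives: (lower, upper, type_name) tuples, one per TypeSetLink.
    Every condition within a tuple must hold (the set), and any one
    tuple holding is enough (the choice).
    """
    return any(
        lower <= len(run_types) <= upper
        and all(t == type_name for t in run_types)
        for lower, upper, type_name in alternatives
    )

alternatives = [(2, 3, "ConceptNode"), (7, 8, "PredicateNode")]

print(accepts_any(["ConceptNode"] * 3, alternatives))
print(accepts_any(["PredicateNode"] * 7, alternatives))
print(accepts_any(["PredicateNode"] * 3, alternatives))
```

The last call is rejected: three atoms fit the first interval but have the wrong type, and they have the right type for the second alternative but the wrong count.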