GlobNode
The GlobNode is a type of VariableNode that can match multiple successive atoms during pattern matching. A normal variable node can only match a single atom. See the Wikipedia article on glob for a definition of globbing.
Example
The pattern immediately below rewrites "I * you" to "I * you too".
(BindLink
   (ListLink
      (ConceptNode "I")
      (GlobNode "$star")
      (ConceptNode "you"))
   (ListLink
      (ConceptNode "I")
      (GlobNode "$star")
      (ConceptNode "you")
      (ConceptNode "too")))
When applied to this:
(ListLink
   (ConceptNode "I")
   (ConceptNode "really")
   (ConceptNode "totally")
   (ConceptNode "need")
   (ConceptNode "you"))
it will produce the output
(ListLink
   (ConceptNode "I")
   (ConceptNode "really")
   (ConceptNode "totally")
   (ConceptNode "need")
   (ConceptNode "you")
   (ConceptNode "too"))
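The rewrite above can be imitated outside the AtomSpace with a small toy model. The sketch below is not the OpenCog pattern matcher: atoms are plain strings, and the names `Glob`, `match`, and `rewrite` are hypothetical helpers invented for illustration. It only shows how a glob that matches one or more successive items produces the output above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glob:
    """Toy stand-in for a GlobNode (hypothetical, not the OpenCog class)."""
    name: str


def match(pattern, atoms, bindings=None):
    """Return the first grounding found, as a dict mapping glob names to
    lists of atoms, or None if the pattern does not match the whole list.
    Each glob must match one or more successive atoms."""
    bindings = {} if bindings is None else bindings
    if not pattern:
        return bindings if not atoms else None
    head, rest = pattern[0], pattern[1:]
    if isinstance(head, Glob):
        # Try every non-empty prefix as the grounding of this glob.
        for n in range(1, len(atoms) + 1):
            result = match(rest, atoms[n:], {**bindings, head.name: atoms[:n]})
            if result is not None:
                return result
        return None
    if atoms and atoms[0] == head:
        return match(rest, atoms[1:], bindings)
    return None


def rewrite(template, bindings):
    """Substitute each glob in the template with the atoms it was bound to."""
    out = []
    for item in template:
        if isinstance(item, Glob):
            out.extend(bindings[item.name])
        else:
            out.append(item)
    return out


pattern = ["I", Glob("$star"), "you"]
template = ["I", Glob("$star"), "you", "too"]
sentence = ["I", "really", "totally", "need", "you"]

grounding = match(pattern, sentence)
result = rewrite(template, grounding)
print(result)
```

In this toy model `$star` is grounded by "really", "totally", "need", and the rewritten list ends in "too". The list "I you" does not match, because the glob needs at least one item.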
Typed globs
By default, GlobNode matches one or more sequential atoms in a list. One may want to match zero or more times, or match no more than N times. This can be accomplished with the TypedVariableLink, together with IntervalLink.
The example below specifies a Glob that must be matched at least twice, but no more than three times:
TypedVariable
    Glob "$foo"
    Interval
        Number 2
        Number 3
It makes use of the IntervalLink to specify a numeric interval. This can be used with the usual type specification mechanism. Thus,
TypedVariable
    Glob "$foo"
    TypeSet
        Interval
            Number 2
            Number 3
        Type "ConceptNode"
indicates that either two or three matches must be made, and the matching type must be ConceptNode. In place of TypeNode here, it should also be possible to use TypeChoiceLink, SignatureLink, and so on.
Matching an unbounded number of items can be specified by using a negative upper bound, like so:
Interval
    Number 2
    Number -1
This specifies a match of 2 or more times, with no upper bound on the number of matches.
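The interval convention described above, including the negative upper bound, can be written as a single predicate. This is a minimal sketch of the stated rule, not the IntervalLink implementation; `in_interval` is a hypothetical name.

```python
def in_interval(count, lower, upper):
    """True if `count` matches lie within [lower, upper].
    A negative `upper` means there is no upper bound."""
    if count < lower:
        return False
    return upper < 0 or count <= upper


# Compare "Interval 2 3" with "Interval 2 -1" for a few match counts.
for n in range(6):
    print(n, in_interval(n, 2, 3), in_interval(n, 2, -1))
```

With bounds 2 and 3, only counts of two or three are accepted. With bounds 2 and -1, every count from two upward is accepted.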
Specifying multiple constraints
By using the TypeSetLink, one can specify a general set of typing constraints. So for example:
TypedVariable
    Glob "$foo"
    TypeChoice
        TypeSet
            Interval
                Number 2
                Number 3
            Type "ConceptNode"
        TypeSet
            Interval
                Number 7
                Number 8
            Type "PredicateNode"
This states that there can be either 2 or 3 matches to ConceptNode, or 7 or 8 matches to PredicateNode. The TypeSetLink is used instead of SetLink to make it clear that it is not just any set, but a set of type specifications; this should make such expressions easier to read and understand. (This might not be implemented yet. If it is, it might not be tested, and it might be broken!)
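The intended reading of such a declaration can be sketched as follows. Since the feature itself may not be implemented, this is only an illustration of the semantics described above, under the assumption that a candidate grounding must satisfy every constraint in at least one TypeSet. Atoms are modeled as `(type, name)` tuples, and all function names are hypothetical.

```python
def in_interval(count, lower, upper):
    """True if count is within [lower, upper]; negative upper is unbounded."""
    return count >= lower and (upper < 0 or count <= upper)


def satisfies_type_set(atoms, type_set):
    """A type set here is (lower, upper, type_name): the number of matched
    atoms must lie in the interval and every atom must have the given type."""
    lower, upper, type_name = type_set
    return in_interval(len(atoms), lower, upper) and all(
        atom_type == type_name for atom_type, _ in atoms
    )


def satisfies_type_choice(atoms, type_sets):
    """A type choice is satisfied if any one of its type sets is satisfied."""
    return any(satisfies_type_set(atoms, ts) for ts in type_sets)


# Toy counterpart of the declaration above.
choice = [(2, 3, "ConceptNode"), (7, 8, "PredicateNode")]

two_concepts = [("ConceptNode", "a"), ("ConceptNode", "b")]
two_predicates = [("PredicateNode", "p"), ("PredicateNode", "q")]
seven_predicates = [("PredicateNode", str(i)) for i in range(7)]

print(satisfies_type_choice(two_concepts, choice))
print(satisfies_type_choice(two_predicates, choice))
print(satisfies_type_choice(seven_predicates, choice))
```

Two ConceptNodes and seven PredicateNodes are accepted; two PredicateNodes are rejected, because the count fits only the first type set and the type fits only the second.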
Variables as GlobNodes
Proposal: the use of intervals should apply to variables, as well as globs. Thus, one should be allowed to write:
TypedVariableLink
    VariableNode "$foo"
    Interval
        Number 2
        Number 42
which would behave exactly the same way as if the variable had been declared as GlobNode "$foo". This is not implemented yet.
If the above were implemented, then the only distinction between variables and globs would be that, by default, variables always match once and only once, while globs, by default, match one or more times.
Greedy vs. lazy matching
Unlike regex globbing, the pattern matcher does not distinguish between greedy and lazy matching; instead, it explores all possible groundings of globs. This is consistent with how other parts of the pattern matcher work: all possible permutations of unordered links are explored; all possible choices of choice links are explored, etc.
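The difference from greedy or lazy regex matching is easiest to see when a pattern contains two adjacent globs. The sketch below is a toy enumeration over plain strings, not the pattern matcher itself; `Glob` and `all_groundings` are hypothetical names. Rather than committing to the longest or shortest match, it yields every consistent grounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glob:
    """Toy stand-in for a GlobNode (hypothetical, not the OpenCog class)."""
    name: str


def all_groundings(pattern, atoms, bindings=None):
    """Yield every grounding of the pattern against the whole atom list.
    Each glob matches one or more successive atoms."""
    bindings = {} if bindings is None else bindings
    if not pattern:
        if not atoms:
            yield bindings
        return
    head, rest = pattern[0], pattern[1:]
    if isinstance(head, Glob):
        # No greedy or lazy preference: every split point is explored.
        for n in range(1, len(atoms) + 1):
            yield from all_groundings(
                rest, atoms[n:], {**bindings, head.name: atoms[:n]}
            )
    elif atoms and atoms[0] == head:
        yield from all_groundings(rest, atoms[1:], bindings)


groundings = list(all_groundings([Glob("$x"), Glob("$y")], ["a", "b", "c"]))
for g in groundings:
    print(g)
```

For three items and two globs there are two groundings: `$x` takes "a" and `$y` takes "b", "c", or `$x` takes "a", "b" and `$y` takes "c". A greedy or lazy regex engine would report only one of these.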