User-defined index

From OpenCog

The AtomSpace is essentially a kind of graph database. Like other databases, it should allow users to define the kinds of structures that they need to be able to locate quickly. This could be done with a user-defined index.

The AtomTable currently has some hand-crafted indexes for quickly locating all atoms of some particular type, or having some particular property. The idea is to remove these and replace them with a generalized, user-defined index (see graph of existing indexes).

The purpose of an index is to very quickly find an atom of some particular type or kind, without having to trawl over the whole AtomSpace to find it -- one just asks for all atoms in the index. To make this work, when an atom is added to the AtomTable, it is also added to every index whose criteria it matches.
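The add-time bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not the OpenCog API: the class `ToyAtomTable`, its methods, and the representation of an atom as a tuple whose first element is its type name are all assumptions made for the example.

```python
from collections import defaultdict


class ToyAtomTable:
    """Toy stand-in for an atom table (hypothetical, not the OpenCog API)."""

    def __init__(self):
        self.atoms = set()
        self.by_type = defaultdict(set)  # index: type name -> atoms of that type

    def add(self, atom):
        # An atom is modeled as a tuple whose first element is its type name.
        self.atoms.add(atom)
        self.by_type[atom[0]].add(atom)  # keep the index in step with the table
        return atom

    def get_by_type(self, type_name):
        # Index lookup: no scan over self.atoms is needed.
        return set(self.by_type.get(type_name, ()))


table = ToyAtomTable()
table.add(("ConceptNode", "Alice"))
table.add(("ConceptNode", "Paris"))
table.add(("PredicateNode", "livesIn"))
concepts = table.get_by_type("ConceptNode")
```

The cost of maintaining the index is paid once per insertion, so that each later lookup is a single dictionary access instead of a scan.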

A general-purpose extension would be to allow user-defined indexes. The "user" provides an index name and a template pattern. Whenever an atom is added to the AtomTable, we check whether it matches the template; if so, it is added to the named index. "Atom" here really means "maybe a whole graph of stuff": the template might be some fairly complicated hypergraph. For example, the template might say "there must be two nested links of type X and Y, and these nodes here could be any type, but that one there must be of type Z", etc.
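A minimal sketch of this scheme follows, assuming atoms are nested Python tuples and that any string starting with `$` in a template is a variable. The names `matches`, `IndexedTable`, `define_index` and `lookup` are hypothetical; the real pattern matcher is far more capable than this toy structural matcher, and the sketch only indexes single atoms rather than multi-atom graphs.

```python
def matches(template, atom, bindings=None):
    """Return variable bindings if atom matches template, else None.

    Strings starting with "$" are variables; tuples are matched element-wise;
    anything else must be equal to the corresponding part of the atom.
    """
    bindings = {} if bindings is None else bindings
    if isinstance(template, str) and template.startswith("$"):
        if template in bindings:
            # A repeated variable must bind to the same sub-graph every time.
            return bindings if bindings[template] == atom else None
        return {**bindings, template: atom}
    if isinstance(template, tuple):
        if not isinstance(atom, tuple) or len(atom) != len(template):
            return None
        for t, a in zip(template, atom):
            bindings = matches(t, a, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if template == atom else None


class IndexedTable:
    """Toy atom table with user-defined, template-driven indexes."""

    def __init__(self):
        self.atoms = set()
        self.templates = {}  # index name -> template
        self.indexes = {}    # index name -> set of matching atoms

    def define_index(self, name, template):
        self.templates[name] = template
        # Back-fill the new index from atoms that are already present.
        self.indexes[name] = {
            a for a in self.atoms if matches(template, a) is not None
        }

    def add(self, atom):
        self.atoms.add(atom)
        # Check the incoming atom against every registered template.
        for name, template in self.templates.items():
            if matches(template, atom) is not None:
                self.indexes[name].add(atom)

    def lookup(self, name):
        return set(self.indexes[name])


table = IndexedTable()
lives_in = ("EvaluationLink", ("PredicateNode", "livesIn"),
            ("ListLink", "$X", "$Y"))
table.define_index("residences", lives_in)

alice = ("EvaluationLink", ("PredicateNode", "livesIn"),
         ("ListLink", ("ConceptNode", "Alice"), ("ConceptNode", "Paris")))
table.add(alice)
table.add(("EvaluationLink", ("PredicateNode", "isHuman"),
           ("ListLink", ("ConceptNode", "Alice"))))
residences = table.lookup("residences")
```

Note that `matches` returns an empty dictionary for a successful match with no variables, so callers compare against `None` rather than testing truthiness.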

For instance, we might have many links in the AtomTable matching the "atom structure template":

Pred($X, $Y) =
BindLink $X $Y
AND
    EvaluationLink isHuman $X
    EvaluationLink
        livesIn
        List $X $Y

... and then we might want a special index that basically indexes "where people live". So, you could look up any location, and find who lives there ...
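One way such a "where people live" index might be kept up to date is sketched below. It is a simplification under stated assumptions: facts arrive as plain `(predicate, args...)` calls rather than as real atoms, and `ResidenceIndex`, `on_add` and `who_lives_in` are hypothetical names. Because the template has two clauses, either clause may arrive first, so the sketch remembers partial matches until the other clause shows up.

```python
from collections import defaultdict


class ResidenceIndex:
    """Hypothetical keyed index: location -> humans known to live there."""

    def __init__(self):
        self.humans = set()                  # every $X with isHuman($X)
        self.lives_in = defaultdict(set)     # $X -> {$Y} from livesIn($X, $Y)
        self.by_location = defaultdict(set)  # $Y -> {$X}: the index itself

    def on_add(self, predicate, *args):
        # Called for every evaluation added to the (toy) atom table.
        if predicate == "isHuman" and len(args) == 1:
            person = args[0]
            self.humans.add(person)
            # Earlier livesIn facts about this person now complete the pattern.
            for place in self.lives_in[person]:
                self.by_location[place].add(person)
        elif predicate == "livesIn" and len(args) == 2:
            person, place = args
            self.lives_in[person].add(place)
            if person in self.humans:
                self.by_location[place].add(person)

    def who_lives_in(self, place):
        return set(self.by_location.get(place, ()))


index = ResidenceIndex()
index.on_add("livesIn", "Alice", "Paris")  # Alice not yet known to be human
index.on_add("isHuman", "Alice")           # completes the match for Alice
index.on_add("isHuman", "Bob")
index.on_add("livesIn", "Bob", "Paris")
index.on_add("livesIn", "Rex", "Paris")    # Rex is never declared human
paris = index.who_lives_in("Paris")
```

Here `paris` contains Alice and Bob but not Rex, since only the first two satisfy both clauses of the template.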

The hardest part of implementing this is already done: the pattern matcher can already match on any arbitrary hypergraph structure, so we would invoke it on incoming atoms to see whether they match the desired template. User:Linas will tutor anyone on how to use this, or write a little convenience wrapper to make it easy.

We still need to define a way of doing type checking: that is, of specifying the kinds of graphs that will be admitted into the index.

In fancier language: the pattern matcher provides a Structured Query interface to the AtomSpace (vaguely similar to SQL or to SPARQL). In functional programming terms, an index is a form of memoization. Alternatively, indexes can be understood as Sinot's director strings for OpenCog's term algebra.

Who needs this? Why should we bother?

  • The embodiment code could probably use this (to be confirmed). It already uses the pattern matcher to filter input from sensors.
  • NLP/chatbot processing could use this (but there are no definite plans at this time).