OpenCogPrime:AtomNotation

Denoting Atoms

Atoms are the basic objects making up OpenCog knowledge. They come in various types, and are associated with various dynamics, which are embodied in MindAgents. Generally speaking, Atoms are endowed with TruthValue and AttentionValue objects. They also sometimes have names, and other associated Values as previously discussed. In the following subsections we will explain how these are notated, and then discuss specific notations for Links and Nodes, the two types of Atoms in the system.

Names

In order to denote an Atom in discussion, we have to call it something. Relatedly but separately, Atoms may also have names within the OpenCog system. (As a matter of implementation, in the current OpenCog version, no Links have names; whereas, all Nodes have names, but some Nodes have a null name, which is conceptually the same as not having a name.)

(name,type) pairs must be unique within each Unit of an OpenCog system; otherwise they cannot be used effectively to reference Atoms. It's OK if two different OpenCog Units both have SchemaNodes named +, but not if one OpenCog Unit has two SchemaNodes both named +. This latter situation is disallowed at the software level, and is assumed in discussions not to occur.
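
As a rough illustration of this uniqueness rule, the following Python sketch models each Unit as a table keyed by (name, type). The Unit class and its add_node and lookup methods are hypothetical names invented for this example, not part of the OpenCog API; nodes with a null name are simply left out of the name index.

```python
class Unit:
    """Hypothetical stand-in for one OpenCog Unit (not the real API)."""

    def __init__(self):
        self._by_key = {}       # (name, type) -> handle
        self._next_handle = 1   # handles are plain integers in this sketch

    def add_node(self, node_type, name=None):
        """Create a node; reject a second node with the same (name, type)."""
        key = (name, node_type)
        if name is not None and key in self._by_key:
            raise ValueError(f"{node_type} named {name!r} already exists in this Unit")
        handle = self._next_handle
        self._next_handle += 1
        if name is not None:    # null-named nodes are not indexed by name
            self._by_key[key] = handle
        return handle

    def lookup(self, node_type, name):
        """Return the handle for (name, type), or None if there is no such node."""
        return self._by_key.get((name, node_type))


unit_a, unit_b = Unit(), Unit()
plus_a = unit_a.add_node("SchemaNode", "+")   # allowed
plus_b = unit_b.add_node("SchemaNode", "+")   # allowed: a different Unit
try:
    unit_a.add_node("SchemaNode", "+")        # same Unit, same (name, type)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

Keying the table on the pair rather than on the name alone is what lets a SchemaNode and, say, a ConceptNode share a name inside one Unit.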

Some Atoms have natural names. For instance, the SchemaNode corresponding to the elementary schema function + may quite naturally be named +. The NumberNode corresponding to the number .5 may naturally be named .5, and the CharacterNode corresponding to the character c may naturally be named c. These cases are the minority, however. For instance, a SpecificEntityNode representing a particular instance of + has no natural name, nor does a SpecificEntityNode representing a particular instance of c.

Names should not be confused with Handles. Atoms have Handles, which are unique identifiers (in practice, numbers) assigned to them by the OpenCog core system; and these Handles are how Atoms are referenced internally, within OpenCog, nearly all the time. Accessing of Atoms by name is a special case — not all Atoms have names, but all Atoms have Handles. An example of accessing an Atom by name is looking up the CharacterNode representing the letter "c" by its name "c". There would then be two possible representations for the word "cat":

  1. this word might be associated with a ListLink, and the ListLink corresponding to "cat" would be a list of the Handles of the CharacterNodes named "c", "a", and "t".
  2. for expedience, the word might be associated with a WordNode named "cat".

In the case where an Atom has multiple versions, each version has a VersionHandle, so that accessing an AtomVersion requires specifying an AtomHandle plus a VersionHandle. More on Handles will be presented in the chapter on the MindOS.
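
The distinction between Handles and names, and the two representations of "cat", can be sketched in Python as follows. This is only an illustrative model: add_atom, atoms, and name_index are hypothetical names, Handles are taken to be consecutive integers, and VersionHandles are not modeled.

```python
from itertools import count

_handles = count(1)   # hypothetical Handle generator: 1, 2, 3, ...
atoms = {}            # Handle -> (type, name or None, outgoing Handles)
name_index = {}       # (type, name) -> Handle, for named Atoms only


def add_atom(atom_type, name=None, outgoing=()):
    """Every Atom gets a Handle; only named Atoms enter the name index."""
    handle = next(_handles)
    atoms[handle] = (atom_type, name, tuple(outgoing))
    if name is not None:
        name_index[(atom_type, name)] = handle
    return handle


for ch in "cat":
    add_atom("CharacterNode", ch)

# Representation 1: a nameless ListLink holding the Handles of "c", "a", "t".
cat_list = add_atom(
    "ListLink",
    outgoing=[name_index[("CharacterNode", ch)] for ch in "cat"],
)

# Representation 2: a single WordNode named "cat".
cat_word = add_atom("WordNode", "cat")
```

In this sketch the ListLink can only be reached through its Handle, while the WordNode can be reached either by Handle or by the pair ("WordNode", "cat").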

OpenCog never assigns Atoms names on its own; in fact, Atom names are assigned only in the two sorts of cases just mentioned:

  1. Via preprocessing of perceptual inputs (e.g. the names of NumberNodes, CharacterNodes)
  2. Via hard-wiring of names for SchemaNodes and PredicateNodes corresponding to built-in elementary schema (e.g. +, AND, Say)

If an Atom A has a name n in the system, we may write

A.name = n

On the other hand, if we want to assign an Atom an external name, we may make a meta-language assertion such as

L1 := (InheritanceLink Ben animal)

indicating that we decided to name that link L1 for our discussions, even though inside OpenCog it has no name.

In denoting nameless Atoms we may use arbitrary names like L1. This is more convenient than using a Handle-based notation, in which Atoms would be referred to as [1], [3433322], etc.; but sometimes we will use the Handle notation as well.

Some ConceptNodes and conceptual PredicateNodes or SchemaNodes may correspond to human-language words or phrases like cat, bite, and so forth. This will be the minority case; most such nodes will correspond to parts of human-language concepts or fuzzy collections of human-language concepts. In discussions in this wikibook, however, we will often invoke the unusual case in which Atoms correspond to individual human-language concepts. This is because such examples are the easiest ones to discuss intuitively. The preponderance of named Atoms in the examples in the wikibook implies no similar preponderance of named Atoms in the real OpenCog system. It is merely easier to talk about a hypothetical Atom named "cat" than it is about a hypothetical Atom (internally) named [434]. It is not impossible that an OpenCog system represents "cat" as a single ConceptNode, but it is just as likely that it will represent "cat" as a map composed of many different nodes without any of these having natural names. Each OpenCog system works out for itself, implicitly, which concepts to represent as single Atoms and which in a distributed fashion.

For another example,

ListLink
    CharacterNode "c"
    CharacterNode "a"
    CharacterNode "t"

corresponds to the character string

("c", "a", "t")

and would naturally be named using the string cat. In the system itself, however, this ListLink need not have any name.

Types

Atoms also have types. When it is necessary to explicitly indicate the type of an atom, we will use the keyword Type, as in

A.Type = InheritanceLink

N_345.Type = ConceptNode

On the other hand, there is also a built-in schema HasType which lets us say

EvaluationLink HasType A InheritanceLink

EvaluationLink HasType N_345 ConceptNode

This covers the case in which type evaluation occurs explicitly in the system, which is useful if the system is analyzing its own emergent structures and dynamics.

Truth Values

The truth value of an atom is a bundle of information describing how true the Atom is, in one of several different senses depending on the Atom type. It is encased in a TruthValue object associated with the Atom. Most of the time, we will denote the truth value of an atom in angle brackets <> following the expression denoting the atom. This very handy notation may be used in several different ways.

A complication is that some Atoms may have CompositeTruthValues, which consist of different estimates of their truth value made by different sources, which for whatever reason have not been reconciled (maybe no process has gotten around to reconciling them, maybe they correspond to different truth values in different contexts and thus logically need to remain separate, maybe their reconciliation is being delayed pending accumulation of more evidence, etc.). In this case we can still assume that an Atom has a default truth value, which corresponds to the highest-confidence truth value that it has, in the Universal Context.

Most frequently, the notation is used with a single number in the brackets, e.g.

A <.4>

to indicate that the atom A has truth value .4; or

IntensionalInheritanceLink Ben monster <.5>

to indicate that the IntensionalInheritance relation between Ben and monster has truth value strength .5. In this case, <tv> indicates (roughly speaking) that the truth value of the atom in question involves a probability distribution with a mean of tv. The precise semantics of the strength values associated with OpenCog Atoms is described in Probabilistic Logic Networks. Please note, though: This notation does not imply that the only data retained in the system about the distribution is the single number .5.

If we want to refer to the truth value of an Atom in the context C, we can use the construct

ContextLink
  C
  A <_truth value_>

Sometimes, Atoms in OpenCog are labeled with two truth value components as defined by PLN: strength and weight-of-evidence. To denote these two components, we might write

IntensionalInheritanceLink Ben scary <.9,.1>

indicating that there is a relatively small amount of evidence in favor of the proposition that Ben is very scary.
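
To make the one-number and two-number forms concrete, here is a minimal Python sketch that reads the trailing bracket off an expression and turns it into name-value pairs. parse_tv is a hypothetical helper written for this page; the field names strength and weight_of_evidence follow the text, and the other bracket forms are deliberately not handled.

```python
import re

# Matches a trailing <...> group such as "<.4>" or "<.9,.1>".
_TV_PATTERN = re.compile(r"<\s*([^<>]+?)\s*>\s*$")
_FIELD_NAMES = ("strength", "weight_of_evidence")


def parse_tv(text):
    """Split 'expr <s>' or 'expr <s,w>' into (expr, {field: value})."""
    match = _TV_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no truth value found in {text!r}")
    numbers = [float(part) for part in match.group(1).split(",")]
    if len(numbers) > len(_FIELD_NAMES):
        raise ValueError("only the <s> and <s,w> forms are handled here")
    atom_expr = text[:match.start()].strip()
    return atom_expr, dict(zip(_FIELD_NAMES, numbers))


simple = parse_tv("A <.4>")
scary = parse_tv("IntensionalInheritanceLink Ben scary <.9,.1>")
```

Note that a single number is stored as strength only; nothing is assumed about the weight of evidence when it is not written.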

If we want to denote a composite truth value (whose components correspond to different "versions" of the Atom), we can use a list notation, e.g.

IntensionalInheritanceLink Ben scary (<.9,.1>, <.5,.9> [h,123],<.6,.7> [c,655])

where e.g.

<.5,.9> [h,123]

denotes the TruthValue version of the Atom indexed by Handle [123]. The h denotes that the AtomVersion indicated by the VersionHandle [h,123] is a Hypothetical Atom, in the sense described in the PLN book. Some versions may not have any index Handles.

The semantics of composite TruthValues are described in the PLN book, but roughly they are as follows. Any version not indexed by a VersionHandle is a "primary TruthValue" that gives the truth value of the Atom based on some body of evidence. A version indexed by a VersionHandle is either contextual or hypothetical, as indicated notationally by the c or h in its VersionHandle. So, for instance, if a TruthValue version for Atom A has VersionHandle [h,123] that means it denotes the truth value of Atom A under the hypothetical context represented by the Atom with handle [123]. If a TruthValue version for Atom A has VersionHandle [c,655] this means it denotes the truth value of Atom A in the context represented by the Atom with Handle [655].
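
The reading of composite truth values given above can be sketched in Python. TVVersion and default_truth_value are hypothetical names, and the choice of default here (the highest-confidence version that has no VersionHandle) is an assumption based on the description of default truth values earlier in this section, not a statement of how OpenCog implements it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TVVersion:
    strength: float
    confidence: float                                  # weight of evidence
    version_handle: Optional[Tuple[str, int]] = None   # ("h" or "c", Handle); None = primary

    def describe(self):
        """Say which kind of version this is, following the c/h convention."""
        if self.version_handle is None:
            return "primary"
        kind, handle = self.version_handle
        label = "hypothetical context" if kind == "h" else "context"
        return f"{label} [{handle}]"


def default_truth_value(versions):
    """Assumed rule: the highest-confidence version without a VersionHandle."""
    primary = [v for v in versions if v.version_handle is None]
    return max(primary, key=lambda v: v.confidence)


# (<.9,.1>, <.5,.9> [h,123], <.6,.7> [c,655])
ben_scary = (
    TVVersion(0.9, 0.1),
    TVVersion(0.5, 0.9, ("h", 123)),
    TVVersion(0.6, 0.7, ("c", 655)),
)
descriptions = [v.describe() for v in ben_scary]
default = default_truth_value(ben_scary)
```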

Alternatively, truth values may sometimes be expressed in <L,U,b> or <L,U,b,N> format, based on the indefinite probability theory presented in the PLN book. For instance,

IntensionalInheritanceLink Ben scary <.7,.9,.8,20>

has the following semantics: there is an estimated 80% chance that, after 20 more observations have been made, the estimated strength of the link will be in the interval (.7,.9).
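
A small Python sketch of the four-number form follows. IndefiniteTV and its reading method are hypothetical names chosen for illustration; the sketch only restates the semantics given above and does not implement any of the indefinite-probability machinery from the PLN book.

```python
from typing import NamedTuple


class IndefiniteTV(NamedTuple):
    L: float   # lower bound of the strength interval
    U: float   # upper bound of the strength interval
    b: float   # credibility level, as a probability
    N: int     # number of further observations considered

    def reading(self):
        """Spell out the <L,U,b,N> semantics as an English sentence."""
        return (
            f"There is an estimated {self.b:.0%} chance that after {self.N} "
            f"more observations the strength will be in the interval "
            f"({self.L:g}, {self.U:g})."
        )


# IntensionalInheritanceLink Ben scary <.7,.9,.8,20>
ben_scary = IndefiniteTV(L=0.7, U=0.9, b=0.8, N=20)
sentence = ben_scary.reading()
```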

Sometimes we will put the TruthValue indicator in a different place, e.g. using indent notation,

IntensionalInheritanceLink <.5>
    Ben
    monster

This is mostly useful when dealing with long and complicated constructions, which is nonetheless a fairly common case.

The notation may also be used to specify a TruthValue probability distribution, e.g.

A < g(.5,7,12)>

would indicate that the truth value of A is given by distribution g with parameters (.5,7,12), or

A < M >

where M is a table of numbers would indicate that the truth value of A is approximated by the table M.

The <> notation for truth value is an unabashedly incomplete and ambiguous notation, but it is very convenient. If we want to specify, say, that the truth value strength of IntensionalInheritanceLink Ben monster is in fact the number .5, and no other truth value information is retained in the system, then we need to say

(IntensionalInheritanceLink Ben monster).TruthValue = [(strength, .5)]

(where a hashtable form is assumed for TruthValue objects, i.e. a list of name-value pairs). But this kind of issue will rarely arise and the <> notation will serve us very well.

We will also sometimes use the object-field style with TruthValues; for instance, if tv is a TruthValue object, tv.s means the strength field of tv.

Attention Values

The AttentionValue object associated with an Atom does not need to be notated nearly as often as truth value. When it does, however, we can use similar notational methods.

AttentionValues may have several components, but the two critical ones are called short-term importance (STI) and long-term importance (LTI). Furthermore, multiple STI values are retained: for each (Atom, MindAgent) pair there may be a Mind-Agent-specific STI value for that Atom. The pragmatic import of these values will become clear in a later chapter when we discuss attention allocation.

Roughly speaking, the long-term importance is used to control memory usage: when memory gets scarce, the atoms with the lowest LTI value are removed. On the other hand, the short-term importance is used to control processor time allocation: MindAgents, when they decide which Atoms to act on, will generally choose the ones that have proved most useful to them in the recent past, and additionally those that have been useful for other MindAgents in the recent past.
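
The two roles described above can be illustrated with a short Python sketch. The function names forget_lowest_lti and pick_for_attention are hypothetical, and the sketch makes two simplifying assumptions: each Atom carries a single STI number rather than one per MindAgent, and "most useful recently" is read as "highest STI".

```python
import heapq


def forget_lowest_lti(atoms, n_to_remove):
    """Remove the n Atoms with the lowest LTI and return their names."""
    victims = heapq.nsmallest(n_to_remove, atoms, key=lambda a: atoms[a]["LTI"])
    for name in victims:
        del atoms[name]
    return victims


def pick_for_attention(atoms, k):
    """Return the names of the k Atoms with the highest STI."""
    return heapq.nlargest(k, atoms, key=lambda a: atoms[a]["STI"])


# Illustrative values only.
atoms = {
    "Cow_7": {"STI": 0.1, "LTI": 0.8},
    "Ben": {"STI": 0.6, "LTI": 0.3},
    "monster": {"STI": 0.4, "LTI": 0.05},
}
removed = forget_lowest_lti(atoms, 1)   # memory is scarce: drop one Atom
chosen = pick_for_attention(atoms, 1)   # a MindAgent picks one Atom to act on
```

Keeping the two numbers separate is what allows an Atom such as Cow_7 to survive in memory (high LTI) while receiving little processor time (low STI).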

We will use the double bracket <<>> to denote attention value (in the rare cases where such denotation is necessary). So, for instance,

Cow_7 <<.5>>

will mean the node Cow_7 has an importance of .5; whereas,

Cow_7 <<STI=.1, LTI = .8>>

will mean the node Cow_7 has short-term importance = .1 and long-term importance = .8.

Of course, we can also use the style

(IntensionalInheritanceLink Ben monster).AttentionValue = [(STI,.1), (LTI, 4)]

where appropriate.

Generally, specific denotations of AttentionValues come up much less often in discussion than specific denotations of TruthValues.

Links

Links are represented using a simple notation that has already occurred many times in this book. For instance,

Inheritance A B

Similarity A B

Note that here the symmetry or otherwise of the link is not implicit in the notation. SimilarityLinks are symmetrical, InheritanceLinks are not. When this distinction is necessary, it will be explicitly made.
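
Since the notation itself does not show symmetry, a reader (or a program) has to know it from the link type. The Python sketch below shows one way to capture this: link_key is a hypothetical helper, and SYMMETRIC_TYPES lists only the one symmetric type named in the text.

```python
# Assumption: only the symmetric type mentioned above is listed here.
SYMMETRIC_TYPES = {"SimilarityLink"}


def link_key(link_type, *targets):
    """Build a comparison key; target order is ignored for symmetric links."""
    if link_type in SYMMETRIC_TYPES:
        return (link_type, tuple(sorted(targets)))
    return (link_type, tuple(targets))


same_similarity = link_key("SimilarityLink", "A", "B") == link_key("SimilarityLink", "B", "A")
same_inheritance = link_key("InheritanceLink", "A", "B") == link_key("InheritanceLink", "B", "A")
```

Here same_similarity is true and same_inheritance is false, mirroring the statement that SimilarityLinks are symmetrical and InheritanceLinks are not.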