ProtoAtom

From OpenCog

The ProtoAtom is the base type for both Atoms and Values, including TruthValues, FloatValues, StringValues and LinkValues. Values provide a mechanism for associating arbitrary, generic values with some given atom. They can be thought of as a per-atom key-value store (a per-atom NoSQL database), or, equivalently, as the mathematical concept of a valuation.

Motivation

The distinction between Atoms and Values is made in order to provide users with two very different and distinct ways of storing information. Atoms provide a general way of encoding graphs and graphical information, with powerful tools to perform searches of the graphs, and to manipulate and transform them in various ways. This power comes with a performance cost: Atoms are maintained in an index (the AtomSpace), are globally unique, are immutable, and the implementation is fat and bulky, in order to enable fast search and graph traversal.

By contrast, Values are meant to provide a light-weight and fast mechanism for tagging Atoms with additional, arbitrary, information. Values are highly mutable. Values are meant to store rapidly changing, fleeting data. Thus, for example, the probability of the truth of some proposition (stored as a Value) will change over time, as new evidence is accumulated. The proposition itself, stored as an Atom, will not change.

The speed provided by Values comes with a penalty: they are not indexed, and so they are not globally searchable. The only way to reach a Value is to know both the Atom that it is attached to and the key under which it is filed. One can, of course, examine all of the Values attached to a particular, given Atom. However, one cannot find all Values stored under some given key, because there is no way to search for all Atoms that use that key.
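The contrast between the two storage styles can be illustrated with a small toy model. This is only a sketch under stated assumptions: the classes ToyAtom and ToyAtomSpace and their methods are hypothetical names invented for this illustration, and they are not the OpenCog API. The sketch shows a global index that keeps atoms unique, and a per-atom dictionary that holds mutable values under keys.

```python
# Toy model only: ToyAtom and ToyAtomSpace are hypothetical names,
# not part of OpenCog.

class ToyAtom:
    """An atom identified by (type, name), carrying its own key-value store."""

    def __init__(self, atom_type, name):
        # The identity fields are set once and never reassigned.
        self.atom_type = atom_type
        self.name = name
        # Values live only here; no global index points at them.
        self._values = {}

    def set_value(self, key, value):
        """Attach or overwrite the value filed under `key`."""
        self._values[key] = value

    def get_value(self, key):
        """Return the value filed under `key`, or None if there is none."""
        return self._values.get(key)

    def keys(self):
        """List every key in use on this one atom."""
        return list(self._values)


class ToyAtomSpace:
    """A global index that guarantees each (type, name) pair exists once."""

    def __init__(self):
        self._index = {}

    def add(self, atom_type, name):
        """Return the unique atom for (type, name), creating it if needed."""
        ident = (atom_type, name)
        if ident not in self._index:
            self._index[ident] = ToyAtom(atom_type, name)
        return self._index[ident]

    def __len__(self):
        return len(self._index)


space = ToyAtomSpace()
rain = space.add("ConceptNode", "it-will-rain")
same = space.add("ConceptNode", "it-will-rain")   # same atom, not a copy

rain.set_value("probability", 0.25)   # initial estimate
rain.set_value("probability", 0.5)    # overwritten as evidence accumulates
rain.set_value("source", "forecast")  # a second, unrelated key
```

In this toy model the atom is found through the index, while a value is reached only through the atom and its key. Nothing maps a key such as "probability" back to the atoms that use it, which mirrors the limitation described above.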

Thus, by providing two very distinct styles of representing and storing data, with two very different performance profiles, it is hoped that the representation of complex data becomes easier.

Terminology

The word "Atom" comes from the idea of an "atomic sentence", in formal logic. Atoms are more-or-less the same thing as the "terms" of "term algebra".

The word "Value" comes from the concept of "valuation" in formal logic and model theory. So, in model theory, a valuation is an assignment of truth values to each and every term; valuations indicate which terms are true, and which are false.

The design goal here is to allow multiple, different valuations at the same time (indexed by the "valuation key"), while also generalizing valuations from binary true/false values to Bayesian or frequentist probabilities (floating point numbers) or any kind of more general value (thus, a general-purpose key-value store).

Implementation Status

The current implementation status is tracked in GitHub bug #513. The implementation is mostly complete; however, general utilities, examples, use cases and documentation are still missing. The range of programming styles that use values has not yet been explored. The current C++ implementation is in opencog/atoms/base.

ProtoAtoms (that is, both atoms and values) can be stored in a persistent store (currently supported only in the Postgres SQL backend).

Type hierarchy

ProtoAtoms are like atoms, except that they lack a TruthValue (TV), an AttentionValue (AV) and an incoming set, and they cannot be stored in the AtomSpace. This makes them smaller, lighter and more efficient. They were invented to provide a common base class for TruthValues and AttentionValues, thus generalizing these, and to provide a common type hierarchy, so that the existing type specification system could be used to specify both atoms and values.

The current type-inheritance hierarchy is as follows, copied from opencog/atoms/base/atom_types.script:

// Special type designating that no atom type has been assigned.
NOTYPE

VALUE
FLOAT_VALUE <- VALUE    // vector of floats, actually.
STRING_VALUE <- VALUE
LINK_VALUE <- VALUE     // vector of values ("link" holding values)
VALUATION <- VALUE

// All of the different flavors of truth values
TRUTH_VALUE <- FLOAT_VALUE
SIMPLE_TRUTH_VALUE <- TRUTH_VALUE
COUNT_TRUTH_VALUE <- TRUTH_VALUE
INDEFINITE_TRUTH_VALUE <- TRUTH_VALUE
FUZZY_TRUTH_VALUE <- TRUTH_VALUE
PROBABILISTIC_TRUTH_VALUE <- TRUTH_VALUE
EVIDENCE_COUNT_TRUTH_VALUE <- TRUTH_VALUE

// The AttentionValue
ATTENTION_VALUE <- FLOAT_VALUE

// Base of hierarchy - NOTE: ATOM will not have a corresponding Python
// construction function to avoid identifier conflict with the Atom object.
ATOM <- VALUE
NODE <- ATOM
LINK <- ATOM

CONCEPT_NODE <- NODE
NUMBER_NODE <- NODE

// Basic Links
ORDERED_LINK <- LINK
UNORDERED_LINK <- LINK

... and so on.

The type hierarchy has the algebraic properties of the mathematical concept of a lattice or partial order. Points in the lattice can be specified with the TypeNode, while upper and lower bounds are specified with the TypeInhNode and the TypeCoInhNode.
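The ordering induced by the `<-` declarations in the listing above can be sketched in a few lines. The following is a minimal illustration, not the OpenCog type loader: the functions parse_types and is_a are hypothetical names, the embedded text is a shortened copy of the excerpt above, and support for comma-separated parents is an assumption, since the excerpt shows only single parents.

```python
# Sketch only: parse_types and is_a are hypothetical helpers,
# not part of OpenCog.

SCRIPT = """
// Special type designating that no atom type has been assigned.
NOTYPE

VALUE
FLOAT_VALUE <- VALUE    // vector of floats, actually.
STRING_VALUE <- VALUE
LINK_VALUE <- VALUE
TRUTH_VALUE <- FLOAT_VALUE
SIMPLE_TRUTH_VALUE <- TRUTH_VALUE
ATOM <- VALUE
NODE <- ATOM
LINK <- ATOM
CONCEPT_NODE <- NODE
ORDERED_LINK <- LINK
"""


def parse_types(text):
    """Map each type name to the list of parents named after `<-`."""
    parents = {}
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()   # drop trailing comments
        if not line:
            continue
        child, _, rest = line.partition("<-")
        # Assumption: several parents, if present, are comma-separated.
        parents[child.strip()] = [p.strip() for p in rest.split(",") if p.strip()]
    return parents


def is_a(parents, child, ancestor):
    """True if `child` is `ancestor` or inherits from it, directly or not."""
    stack, seen = [child], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


TYPES = parse_types(SCRIPT)
```

With this table, is_a(TYPES, "SIMPLE_TRUTH_VALUE", "VALUE") holds because the relation is transitive, while FLOAT_VALUE and ATOM are incomparable: neither inherits from the other, although both sit below VALUE.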