AtomSpace
The AtomSpace is an interface for the manipulation and storage of hypergraphs, and is the central knowledge representation system provided by the OpenCog Framework. The AtomSpace provides a generic interface for creating and deleting hypergraphs of Atoms (the superclass for Nodes and Links).
Conceptually, knowledge in OpenCog is stored within large [weighted, labeled] hypergraphs, with nodes and links connected together to represent knowledge. This is done on two levels: information primitives are symbolized by individual nodes/links or small sets of them, while patterns of relationships or activity are found in [potentially] overlapping and nested networks of nodes and links. (OCP tutorial log #2).
Once created, a hypergraph is immutable; it can only be deleted. Atom properties (such as their TruthValue or STI), however, are changeable at any time. The AtomSpace has other important design aspects, reviewed below. The primary, driving requirement is to efficiently support induction, reasoning, neural nets and other graphical algorithms with the best possible performance and flexibility. Essentially, the AtomSpace must offer an efficient substrate for Self-Modifying Evolving Probabilistic Hypergraphs.
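The split between fixed structure and changeable properties can be illustrated with a minimal sketch. This is a toy model in Python, not the OpenCog API: the class names `Atom` and `TruthValue` follow the text, but their fields (`strength`, `confidence`, `sti`) and constructor arguments are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TruthValue:
    """Hypothetical, simplified truth value; it may be changed at any time."""
    strength: float = 1.0
    confidence: float = 0.0


class Atom:
    """Toy atom: type, name and outgoing set are fixed once created."""

    def __init__(self, atom_type, name="", outgoing=()):
        self._type = atom_type
        self._name = name
        self._outgoing = tuple(outgoing)  # a tuple cannot be modified in place
        self.tv = TruthValue()            # mutable property
        self.sti = 0                      # mutable property (short-term importance)

    # Read-only properties: there are no setters, so assignment raises
    # AttributeError and the hypergraph structure stays as created.
    @property
    def type(self):
        return self._type

    @property
    def name(self):
        return self._name

    @property
    def outgoing(self):
        return self._outgoing


cat = Atom("ConceptNode", "cat")
animal = Atom("ConceptNode", "animal")
link = Atom("InheritanceLink", outgoing=[cat, animal])

link.tv.strength = 0.9   # allowed: properties are changeable
link.sti = 5             # allowed
```

In this sketch, changing what a link connects means creating a new atom and deleting the old one, which mirrors the "immutable, can only be deleted" rule stated above.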
The contents of the AtomSpace can be visualized using the AtomSpace visualizer.
Hypergraph Database
Put more bluntly, the AtomSpace is nothing more than a container for storing (hyper-)graphs. It is optimized so that one can implement (probabilistic/fuzzy) inference engines, theorem provers, etc., such as PLN, on top of it. To achieve this, the AtomSpace must have some database-like features:
- It must be queryable for all occurrences of certain hypergraphs, i.e. one must be able to perform pattern-matching against arbitrary query patterns.
- It must maintain user-defined indexes of certain common pattern types, using e.g. the RETE algorithm, and/or other database-like indexing systems.
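To make the idea of querying by pattern concrete, here is a minimal sketch of matching a query hypergraph against stored hypergraphs. It is a toy written in Python and is not the OpenCog pattern matcher: atoms are modelled as nested tuples, and the helper names `is_variable`, `match` and `query` are hypothetical. Only top-level atoms are tested against the pattern.

```python
# Toy encoding (an assumption of this sketch):
#   node: (type, name)               e.g. ("ConceptNode", "cat")
#   link: (type, atom, atom, ...)    e.g. ("InheritanceLink", cat, animal)
#   variable: ("VariableNode", "$x")

def is_variable(atom):
    return atom[0] == "VariableNode"


def match(pattern, atom, bindings):
    """Return extended bindings if pattern matches atom, else None."""
    if is_variable(pattern):
        name = pattern[1]
        if name in bindings:
            # A variable that is already bound must match the same atom again.
            return bindings if bindings[name] == atom else None
        return {**bindings, name: atom}
    if pattern[0] != atom[0] or len(pattern) != len(atom):
        return None
    # Nodes (or empty links) are compared directly.
    if len(pattern) < 2 or isinstance(pattern[1], str) or isinstance(atom[1], str):
        return bindings if pattern == atom else None
    # Links: match the outgoing atoms position by position.
    for sub_pattern, sub_atom in zip(pattern[1:], atom[1:]):
        bindings = match(sub_pattern, sub_atom, bindings)
        if bindings is None:
            return None
    return bindings


def query(atoms, pattern):
    """Collect the variable bindings for every stored atom the pattern matches."""
    results = []
    for atom in atoms:
        found = match(pattern, atom, {})
        if found is not None:
            results.append(found)
    return results


cat = ("ConceptNode", "cat")
dog = ("ConceptNode", "dog")
animal = ("ConceptNode", "animal")
atoms = [
    ("InheritanceLink", cat, animal),
    ("InheritanceLink", dog, animal),
    ("SimilarityLink", cat, dog),
]

# "Find every $x that inherits from animal."
pattern = ("InheritanceLink", ("VariableNode", "$x"), animal)
results = query(atoms, pattern)
```

This linear scan visits every atom; a real system would rely on indexes of the kind described in the second requirement to avoid that.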
The AtomSpace implementation has some additional requirements:
- Perform queries as fast as possible.
- Be thread-safe.
- Hold hypergraphs consisting of billions or trillions (or more) of nodes/links; scale to petabytes (or more).
- Save & restore hypergraphs to media, such as disk, a more traditional SQL or non-SQL database, or other system (e.g. flat files, XML, etc.).
- Exchange, query and synchronize hypergraphs with other, network-remote atomspaces or servers, in a manner as fast as possible, while maintaining as much coherency as possible.
The current implementation fails on many of these requirements, especially in scalability; there is much room for improvement. Nonetheless, a basic system exists. The following is currently possible:
- AtomSpace contents can be saved/restored as s-expressions (i.e. Scheme), as XML (deprecated), and in an SQL database.
- The SQL storage backend allows a weak form of dynamic, on-demand loading of hypergraphs into the AtomSpace.
- A generic pattern matcher has been implemented. Queries may be specified as hypergraphs themselves, using ImplicationLink and BindLink. Procedures may be triggered using ExecutionLink. Low-level access to the pattern matcher is possible by coding in C++ (discouraged, except for engine optimization work), Scheme or (coming soon!) Python.
- A half-dozen hard-coded indexes are kept by the AtomSpace. User-defined indexes are not yet supported.
- AtomSpace contents may be viewed through a web interface.
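The text does not say which indexes are hard-coded, so the following is only a sketch of what such indexes could look like: a container that keeps a by-type index and an incoming-set index up to date as atoms are added. The class `IndexedStore` and its methods are hypothetical names, and atoms are modelled as nested tuples, with nodes as `(type, name)` and links as `(type, atom, ...)`.

```python
from collections import defaultdict


class IndexedStore:
    """Toy container with two built-in indexes (illustrative only)."""

    def __init__(self):
        self.atoms = set()
        self.by_type = defaultdict(set)    # type name -> atoms of that type
        self.incoming = defaultdict(set)   # atom -> links that contain it

    def add(self, atom):
        if atom in self.atoms:
            return atom
        is_link = len(atom) > 1 and not isinstance(atom[1], str)
        if is_link:
            # Make sure every outgoing atom is stored and indexed first.
            for child in atom[1:]:
                self.add(child)
                self.incoming[child].add(atom)
        self.atoms.add(atom)
        self.by_type[atom[0]].add(atom)
        return atom

    def get_by_type(self, atom_type):
        return set(self.by_type.get(atom_type, ()))

    def get_incoming(self, atom):
        return set(self.incoming.get(atom, ()))


store = IndexedStore()
cat = ("ConceptNode", "cat")
animal = ("ConceptNode", "animal")
link = store.add(("InheritanceLink", cat, animal))
```

With indexes like these, "all atoms of a given type" and "all links that mention this atom" become dictionary lookups rather than scans over the whole store.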
Nodes and Links
There is a class for Node and another for Link, as well as a common superclass, Atom. Accordingly, the AtomSpace can be regarded as the set of Nodes and Links that represents the knowledge stored in an OpenCog database. Nodes represent entities in general, while Links represent relationships among two or more Atoms (Links may link Nodes to Nodes, Nodes to Links, or Links to Links).
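The class relationship described here can be sketched as follows. The names `Atom`, `Node` and `Link` come from the text; the field names (`type`, `name`, `outgoing`) and the `arity` property are assumptions of this Python sketch, which does not reproduce the actual C++ classes. The atom types used in the example are plain strings chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Atom:
    """Common superclass of Node and Link."""
    type: str


@dataclass(frozen=True)
class Node(Atom):
    """An entity, identified by its type and name."""
    name: str


@dataclass(frozen=True)
class Link(Atom):
    """A relationship among other atoms, which may be Nodes or Links."""
    outgoing: Tuple[Atom, ...]

    @property
    def arity(self):
        return len(self.outgoing)


cat = Node("ConceptNode", "cat")
animal = Node("ConceptNode", "animal")

# Node-to-Node link.
inheritance = Link("InheritanceLink", (cat, animal))

# Node-to-Link link: a link whose outgoing set contains another link.
biology = Node("ConceptNode", "biology")
in_context = Link("ContextLink", (biology, inheritance))
```

Because a Link's outgoing set holds Atoms rather than only Nodes, nesting of links to arbitrary depth falls out of the type structure with no extra machinery.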
Handles and the TLB
Atoms are uniquely identified by a Handle. Pointers to Atoms should never be kept in any data structure other than the AtomTable. All references to atoms should proceed through handles via the AtomSpace API.
The purpose of this handle-to-atom translation mechanism (the TLB) is to allow Atoms to be stored on disk, or even on another machine, rather than always being kept in RAM. Thus, an Atom is fetched into local memory only when it is actually needed. See opencog/atomspace/BackingStore.h for details, and opencog/persist/README for a persistence implementation.
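The fetch-on-demand behaviour of handles can be sketched with a small Python model. This is not the interface declared in BackingStore.h: the classes `DictBackingStore` and `HandleTable`, their methods, and the use of integers as handles are all assumptions made for illustration.

```python
class DictBackingStore:
    """Stand-in for disk or a remote machine; counts how often it is hit."""

    def __init__(self, records):
        self._records = dict(records)
        self.fetch_count = 0

    def fetch(self, handle):
        self.fetch_count += 1
        return self._records[handle]


class HandleTable:
    """Toy handle-to-atom translation with lazy loading."""

    def __init__(self, backing_store):
        self._resident = {}          # handle -> atom currently in RAM
        self._store = backing_store

    def resolve(self, handle):
        # Fetch the atom only when it is first needed, then keep it resident.
        if handle not in self._resident:
            self._resident[handle] = self._store.fetch(handle)
        return self._resident[handle]

    def evict(self, handle):
        # Drop the in-memory copy; the handle itself remains valid.
        self._resident.pop(handle, None)


store = DictBackingStore({1: ("ConceptNode", "cat")})
table = HandleTable(store)

first = table.resolve(1)    # loads from the backing store
second = table.resolve(1)   # served from memory, no second fetch
```

Callers in this sketch hold only the integer handle, never the atom object itself, which is what lets the table evict and reload atoms without invalidating any reference.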
