From OpenCog

The Management of Complex AtomSpaces

As we have seen, a complex OCP configuration involves a collection of interacting "units". Most of these units are dynamic simple AtomSpaces. A handful, however, may themselves be dynamic complex AtomSpaces, containing simple AtomSpaces as components. So far we have discussed the way CIM-Dynamics act within a simple AtomSpace. Now we turn to the management of complex AtomSpaces. There are three sorts of issues here:

  • How do different Units and Unit-Groups interact?
  • How do the machines inside a multi-machine Unit manage the Unit-local distributed processing?
  • How does the system overall regulate the distribution of resources among various Units and Unit-Groups?

Here we give an overview of our treatment of these issues. Further details have been discussed and documented separately, but we omit them here, as these issues have more to do with distributed processing than with AI. This sort of work is not easy to do, but it is a well-understood part of computer science. While OCP's requirements as a distributed system are subtle, they are not outside the scope of the existing theory and practice of distributed systems design and implementation.

Control and Distributed Processing

In general, we may assume that a multi-Unit OCP is regulated by a process called a MultipartAtomspaceController or MAC, which is not a Unit but rather a piece of software designed to regulate sets of Units. There may sometimes be a need to introduce some non-Unit servers into an OCP, carrying out specialized auxiliary algorithms. These servers will also be regulated by the MAC.

There is also the problem of physically starting up, shutting down, and monitoring the machines in an OCP. This too is done by the MAC. Note that the MAC in the current framework must regulate all the machines in the OCP, regardless of which Unit they are assigned to.

Within a distributed Unit, one has the problem of load balancing. This means that the Atoms in the Unit may be redistributed among the various machines assigned to the Unit, in order to maximize the effectiveness of the system. In practice one wishes to maximize the frequency with which CIM-Dynamics get to act, and minimize the amount of messaging between the machines in the Unit. This is an optimization problem which we have worked out in detail, but which we will not discuss here.
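The trade-off described above can be illustrated with a toy objective. This is only a sketch under stated assumptions, not the optimization worked out for OCP: the function names, the weights, and the use of "Atoms per machine" as a stand-in for how often CIM-Dynamics get to act are all hypothetical.

```python
from collections import Counter

def cross_machine_links(links, placement):
    """Count links whose two Atoms sit on different machines (a proxy for messaging)."""
    return sum(1 for a, b in links if placement[a] != placement[b])

def load_imbalance(placement, machines):
    """Difference between the most and least loaded machine, measured in Atoms."""
    counts = Counter(placement.values())
    loads = [counts.get(m, 0) for m in machines]
    return max(loads) - min(loads)

def placement_cost(links, placement, machines,
                   message_weight=1.0, balance_weight=1.0):
    """Hypothetical objective: lower is better. Both weights are assumptions."""
    return (message_weight * cross_machine_links(links, placement)
            + balance_weight * load_imbalance(placement, machines))

# Two candidate placements of four Atoms on two machines.
machines = ["m1", "m2"]
links = [("a", "b"), ("c", "d")]
scattered = {"a": "m1", "b": "m2", "c": "m1", "d": "m2"}
grouped = {"a": "m1", "b": "m1", "c": "m2", "d": "m2"}

scattered_cost = placement_cost(links, scattered, machines)
grouped_cost = placement_cost(links, grouped, machines)
```

Under this toy objective the grouped placement scores better, because linked Atoms share a machine while the load stays even. A real load balancer would search over placements rather than compare two by hand.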

Distribution and Live Caching of Atoms in OCP

Since we can't fit all of OCP's Atoms in a single machine, and we want to minimize communication between machines, how do we handle the distribution of the Atoms? Currently, OCP uses a combination of global and local caching AtomSpaces, a design centered on the concept of a Mind DB.

The Mind DB based design is founded on a few simple assumptions:

  • OCP has multiple, specialized units.
  • Each unit carries out a number of CIM-Dynamics on an AtomSpace that contains a subset of the global Complex-Atomspace.
  • Some Atoms may be used in different units, by different CIM-Dynamics.

The last assumption implies that different transformations may be effected upon these Atoms by the different CIM-Dynamics. This suggests that we need a special CIM-Dynamic to reconcile conflicting changes. The solution that most appealed to us was to create a specialized Unit which is responsible for being the central Atom storage. This is the Petaverse.MindDB.

The Petaverse.MindDB is then a Globally-Caching-AtomSpace and an associated set of CIM-Dynamics. These are the dynamics responsible for freezing and defrosting of Atoms (as examined in the previous section), and for the reconciliation of conflicting transformations on a single Atom (using PLN rules such as revision and the Rule of Choice), as well as the more mundane tasks of transaction control and data distribution. All the Atoms that should be in the AtomSpace available to the OCP mind as a whole are included in the Mind DB, in one of the three possible states already mentioned.
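To make the reconciliation step concrete, here is a minimal sketch of a revision-style merge of two conflicting truth values for the same Atom. It assumes a simplified truth value made of a strength and an evidence count, and it assumes the two updates rest on independent evidence; the `TruthValue` class and `revise` function are illustrative names, not OCP's API, and the Rule of Choice is not modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TruthValue:
    strength: float  # estimated probability, in [0, 1]
    count: float     # amount of evidence behind the estimate, >= 0

def revise(a: TruthValue, b: TruthValue) -> TruthValue:
    """Merge two estimates as an evidence-weighted average of their strengths.

    Simplification: the evidence sets are assumed not to overlap, so the
    counts simply add up.
    """
    total = a.count + b.count
    if total == 0:
        # No evidence on either side: keep the first estimate unchanged.
        return a
    strength = (a.count * a.strength + b.count * b.strength) / total
    return TruthValue(strength=strength, count=total)

# Two Units changed the same Atom's truth value in different ways.
from_unit_1 = TruthValue(strength=0.8, count=30)
from_unit_2 = TruthValue(strength=0.4, count=10)
merged = revise(from_unit_1, from_unit_2)
```

The merged value leans toward the estimate backed by more evidence, which is the intuition behind using revision rather than letting the most recent write win.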

In order to minimize messaging between machines, one may want to introduce MindDBProxies in each machine. A MindDBProxy contains local copies of Atoms that live inside the Petaverse.MindDB. This is a way of dealing with the frequent situation where there are numerous Atoms that are heavily used by multiple CIM-Dynamics, across multiple Units. By keeping proxies of them distributed around, one significantly reduces the amount of inter-machine messaging, thus speeding up the whole system.
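The proxy idea can be sketched as a read-through cache in front of a central store. Everything here is a hypothetical illustration: the `MindDB` and `MindDBProxy` classes, their methods, and the `remote_reads` counter are assumptions made for the example, and network transport, write propagation, and conflict reconciliation are left out.

```python
class MindDB:
    """Stand-in for the central Atom store; counts reads that would cross machines."""

    def __init__(self):
        self._atoms = {}
        self.remote_reads = 0

    def put(self, handle, atom):
        self._atoms[handle] = atom

    def fetch(self, handle):
        self.remote_reads += 1  # each fetch models one inter-machine message
        return self._atoms[handle]


class MindDBProxy:
    """Per-machine cache holding local copies of Atoms that live in the MindDB."""

    def __init__(self, mind_db):
        self._mind_db = mind_db
        self._local = {}

    def get(self, handle):
        # Only go to the central store on a local miss.
        if handle not in self._local:
            self._local[handle] = self._mind_db.fetch(handle)
        return self._local[handle]

    def invalidate(self, handle):
        # Drop a stale local copy so the next read goes back to the MindDB.
        self._local.pop(handle, None)


db = MindDB()
db.put("cat", {"type": "ConceptNode", "name": "cat"})
proxy = MindDBProxy(db)
for _ in range(3):
    proxy.get("cat")  # only the first read reaches the central store
```

A heavily used Atom is fetched once per machine rather than once per access. The cost is that local copies can go stale, which is why some invalidation or reconciliation path back to the central store is needed.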