OpenCog As OCP's Mind OS
The foci of this OpenCogPrime wikibook are the OCP cognitive architecture and dynamics, and the general OCP software architecture. By and large, in these pages, we are intentionally avoiding discussion of OCP implementation details, which form a long, complex and subtle story in themselves. However, one lesson that we've learned in our years of experience designing and building large-scale AGI-oriented software systems like Webmind and OCP is that the mathematical/conceptual and implementation levels cannot be separated nearly as thoroughly as one might like. Just as the human mind is strongly influenced by the nature of its underlying wetware, OCP is strongly influenced by the nature of its underlying hardware and operating system. In this page and its children we review those aspects of the OCP software architecture that cannot be omitted without making the conceptual material about OCP excessively vague and opaque.
Layers Intervening Between Source Code and AI Mind
There are multiple layers intervening between a body of AI source code and a conceptual theory of mind. How many layers to explicitly discuss is a somewhat arbitrary decision, but one way to picture it would be as in the following table:
|Level of Abstraction||Description||Example|
|2||Detailed software design|| |
|3||High-level software design/architecture||largely programming-language-independent, but not hardware-architecture-independent: much of the material in this chapter, for example ... most of the OpenCog Framework|
|4||Mathematical/conceptual AI design||e.g., the sort of characterization of OCP given in most of this book|
|5||Abstract mathematical modeling of cognition||e.g. the SMEPH model, which could be used to inspire or describe many different AI systems|
|6||Philosophy of mind||e.g. the Psynet Model of Mind, patternism, etc.|
Some High-level Implementation Issues
This chapter describes the basic architecture of OCP as a software system, implemented within the OpenCog Framework (OCF). We describe here those aspects of the OpenCog Framework that are most essential to OpenCog Prime. The OpenCog Framework forms a bridge between the mathematical structures and dynamics of OCP's concretely implemented mind, and the nitty-gritty realities of modern computer technology. All the Atoms and MindAgents of OCP live within the OpenCog Framework, so a qualitative understanding of the nature of the OCF is necessary for an understanding of how the system works; and of course a detailed understanding of the OCF is necessary for doing concrete implementation work on OCP.
The nature of the OCF is strongly influenced by the quantitative requirements imposed on the system, as well as the general nature of the structure and dynamics that it must support. The large number and great diversity of Atoms needed to create a significantly intelligent OCP demand that we pay careful attention to such issues as multithreading, distributed processing, and scalability in general. The number of Nodes and Links that we will need in order to create a reasonably complete OCP is still largely unknown. But our experiments with learning, natural language processing, and cognition over the past few years have given us an intuition for the question. We believe that we are likely to need billions of Atoms (probably not trillions, and almost surely not quadrillions) in order to achieve a high degree of general intelligence. Hundreds of millions strikes us as possible but overly optimistic. In fact we have already run OCP systems utilizing hundreds of millions of Atoms, though in a simplified dynamical regime with only a couple of MindAgents acting on most of them.
The ideal hardware platform for OCP would be a massively parallel hardware architecture, in which each Atom was given its own processor and memory. The closest thing created in the history of computing has been the Connection Machine (Hillis, 1986): a CM5 was once built with 64000 processors and local RAM for each processor. But even 64000 processors wouldn't be enough for a highly intelligent OCP to run in a fully parallelized manner, since we're sure we need more than 64000 Nodes.
Connection Machine style hardware seems to have perished in favor of more standard SMP machines. It is true that each year we see SMP machines with more and more processors on the market. However, the state of the art is still in the range of hundreds of processors, many orders of magnitude from what would be necessary for a one-Atom-per-processor OCP implementation. And while these top-end multiprocessor SMP server machines would be very nice for a many-Atoms-per-processor OCP implementation such as we have now, they are also extremely expensive per unit of memory and processing power, compared with commodity hardware.
So, at the present time, technological and financial reasons have pushed us to implement the system using a relatively mundane and standard hardware architecture. OCP version 1 will most likely live on a network of high-end commodity SMP machines. These are machines with a few Gigabytes of RAM and a few processors, so a highly intelligent OCP requires a cluster of dozens and possibly hundreds or thousands of such machines (we think it's unlikely that tens of thousands will be required, and extremely unlikely that hundreds of thousands will be). Given this sort of architecture, we need effective ways to swap Atoms back and forth between disk and RAM, and carefully manage the allocation of processor time among the various CIM-Dynamics that demand it. The use of a widely-distributed network of weaker machines for peripheral processing is a serious possibility, and we have some detailed software designs addressing this option; but for the near future we believe that this can only be a valuable augmentation to core OCP processing, which must remain on a dedicated cluster.
Of course, the use of specialized hardware is also a viable possibility, but the point is that we are not counting on it. Hugo de Garis (de Garis and Korkin, 2002) has devised techniques for using FPGAs (Field-Programmable Gate Arrays) to implement very efficient evolutionary learning of 3D formal neural networks; and it may be that such techniques could be modified to yield efficient evolutionary learning of OCP procedures. More speculatively, it might be possible to use evolutionary quantum computing to accelerate OCP procedure learning. Or, in a different direction, Octiga Bay has created an intriguing supercomputer architecture that integrates both FPGAs and a special processor interconnect fabric that provides extremely rapid cross-processor networking; this might well reduce the overhead involved in distributed OCP processing considerably. All these possibilities are exciting to envision, but the OCP architecture does not require any of them in order to be successful.
Marvin Minsky has reportedly expressed the belief that a human-level general intelligence could be implemented on a 486 PC, if we just knew the algorithm. We doubt this is the case, and it is certainly not the case for OCP. By current computing hardware standards, an OCP system is a considerable resource hog. And it will remain so for at least several years to come, even taking Moore's Law into account.
It is one of the jobs of the OCF to manage the system's gluttonous behavior. It is the software layer that abstracts the real-world efficiency compromises from the rest of the system. This is why we call it a "Mind OS": it provides services, rules, and protection to the Atoms and CIM-Dynamics that live on top of it, which are then allowed to ignore the software architecture they live on.
We will not discuss the coding details of the OCF here, but we will sketch a few of the more important particulars of the design. Our goal here is to illustrate some of the more implementation-oriented aspects of OCP in a platform- and programming-language-neutral mathematical way, and to show that the implementation of these formalisms is both necessary and possible. We will address four major aspects of the engineering side of OCP development:
- Distribution of Atoms and CIM-Dynamics across multiple machines and functional units.
- Scheduling of processor time (or as we say, attention allocation) to CIM-Dynamics.
- Caching of unimportant Atoms to disk (freezing) and their reloading onto RAM upon demand (defrosting).
- Automatic control of the system's numerous parameters, to prevent it from crashing.
These concepts will be refined in the following two chapters, when we enumerate the system's core Dynamics and discuss multiple possible OCP Configurations.
What we will describe here is basically the OCF architecture as intended for OCP in its "human level AGI-capable version," with comments occasionally inserted in places where the current OCF/OCP software architecture differs from this (due to time constraints during implementation or other short-term practical reasons). For instance, at time of writing (July 2008) the OCF implementation supports neither distributed processing nor Atom-by-Atom freezing/defrosting. It has been coded so as to make extension to distributed processing reasonably simple (e.g. Links don't involve lists of Node objects, they involve lists of Node handles, where a handle is an object that may be resolved to a local Node or to a Node on a remote machine). It does do freezing/defrosting, but in a cruder way than desirable, saving and loading large groups of Atoms at a time. Automated parameter adaptation is currently implemented in a relatively simplistic way; the Webmind AI Engine, a predecessor to the Novamente Cognition Engine (NCE), contained a full parameter adaptation subsystem very much along the lines described here.
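The handle idea mentioned above can be illustrated with a minimal sketch. This is not the OCF code: the names `Handle`, `NodeStore`, `fetch_remote`, and the link type string are hypothetical, and the "remote" lookup is simulated with an in-process dictionary rather than a network call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Handle:
    """Hypothetical handle: identifies a Node without holding the Node itself."""
    machine_id: str
    atom_id: int


@dataclass
class Node:
    name: str


@dataclass
class Link:
    link_type: str
    targets: List[Handle]  # handles, not Node objects


class NodeStore:
    """Hypothetical per-machine store that resolves handles locally or remotely."""

    def __init__(self, machine_id: str, fetch_remote: Callable[[Handle], Node]):
        self.machine_id = machine_id
        self._local: Dict[int, Node] = {}
        self._fetch_remote = fetch_remote
        self._next_id = 0

    def add_node(self, name: str) -> Handle:
        handle = Handle(self.machine_id, self._next_id)
        self._next_id += 1
        self._local[handle.atom_id] = Node(name)
        return handle

    def resolve(self, handle: Handle) -> Node:
        if handle.machine_id == self.machine_id:
            return self._local[handle.atom_id]   # Node lives on this machine
        return self._fetch_remote(handle)        # Node lives elsewhere


# Simulated two-machine setup; a real system would use a network transport here.
stores: Dict[str, NodeStore] = {}


def fetch_remote(handle: Handle) -> Node:
    return stores[handle.machine_id].resolve(handle)


stores["a"] = NodeStore("a", fetch_remote)
stores["b"] = NodeStore("b", fetch_remote)

cat = stores["a"].add_node("cat")
animal = stores["b"].add_node("animal")
link = Link("InheritanceLink", [cat, animal])

# Resolving from machine "a": one target is local, the other is remote.
names = [stores["a"].resolve(h).name for h in link.targets]
```

Because the Link stores only handles, the code that walks its targets does not need to know whether a given Node is local or remote.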
Varieties of AtomSpace
In other pages reviewing hypergraph formalism and terminology (SMEPH, AtomFormalization), we introduced one kind of AtomSpace, which here will be called the Simple-Atomspace: simply, a container of Atoms. Conceptually, we don't need anything besides that. However, to accurately mathematically model what goes on in the MindOS (in a complex, distributed OCF instance), we need more refined and complicated spaces. We need to handle distributed processing, in which an AtomSpace may be spread among multiple units. And we need to handle caching, in which the Atoms in an AtomSpace may be in different states regarding their availability in the cache.
To deal with distributed processing, we must introduce further notions such as
(where ^k means the k'th power according to the Cartesian product).
A Multipart-AtomSpace is an AtomSpace that consists of a number of distinct parts. The definition allows for nesting (Multipart-AtomSpaces that live inside Multipart-AtomSpaces), but this nesting is only allowed to go on for a finite number of levels, because all sets are assumed finite. In practice we do not go beyond two levels of recursion in the current OCP design. That is, we have a couple of cases of Multipart-AtomSpaces that contain Multipart-AtomSpaces; but it goes no further.
In practice the separate parts of a Multipart-AtomSpace will usually be functionally specialized parts, devoted to one or another particular cognitive task.
where this structure has the following interpretation. An OCP that is capable of caching and reloading (freezing and defrosting) Atoms may be considered, at any given time, as a triple
(live atoms, frozen proxy atoms, frozen atoms)
The frozen atoms are on disk, the live atoms are in RAM, and the frozen proxy atoms are shells living in RAM pointing to frozen atoms on disk, existing only so that live atoms can use them as handles to grab frozen atoms by.
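As a rough illustration of this triple, the sketch below keeps live atoms in a dictionary, frozen atoms as JSON files on disk, and frozen proxy atoms as small in-RAM entries that record where the frozen atom lives. The class name `CachingAtomSpace`, the JSON file layout, and the atom payloads are all assumptions made for the example, not the OCF design.

```python
import json
import os
import tempfile


class CachingAtomSpace:
    """Hypothetical sketch of (live atoms, frozen proxy atoms, frozen atoms)."""

    def __init__(self, directory):
        self.directory = directory
        self.live = {}            # atom_id -> payload held in RAM
        self.frozen_proxies = {}  # atom_id -> path of the frozen atom on disk

    def add(self, atom_id, payload):
        self.live[atom_id] = payload

    def freeze(self, atom_id):
        """Move a live atom to disk, leaving only a proxy in RAM."""
        path = os.path.join(self.directory, f"{atom_id}.json")
        with open(path, "w") as f:
            json.dump(self.live.pop(atom_id), f)
        self.frozen_proxies[atom_id] = path

    def defrost(self, atom_id):
        """Reload a frozen atom into RAM and drop its proxy and disk copy."""
        path = self.frozen_proxies.pop(atom_id)
        with open(path) as f:
            self.live[atom_id] = json.load(f)
        os.remove(path)
        return self.live[atom_id]

    def get(self, atom_id):
        """Return an atom, defrosting it on demand if only a proxy is in RAM."""
        if atom_id in self.live:
            return self.live[atom_id]
        return self.defrost(atom_id)


with tempfile.TemporaryDirectory() as tmp:
    space = CachingAtomSpace(tmp)
    space.add("n1", {"name": "cat", "importance": 0.9})
    space.add("n2", {"name": "quark", "importance": 0.01})

    space.freeze("n2")  # the unimportant atom goes to disk
    state_after_freeze = (
        sorted(space.live), sorted(space.frozen_proxies), sorted(os.listdir(tmp))
    )

    restored = space.get("n2")  # access through the proxy triggers defrosting
    state_after_defrost = (
        sorted(space.live), sorted(space.frozen_proxies), sorted(os.listdir(tmp))
    )
```

The proxy is what lets `get` succeed for a frozen atom: callers hold only the atom's identifier and never need to know which of the three sets it currently belongs to.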
Real AtomSpaces may combine the features of caching and distributed processing. There are two simple ways to do this: global caching, where there is one cache for the whole distributed network, and local caching, where each part of the system has its own cache. There are also more complex possible arrangements, in which some parts share a cache and others have their own, giving rise to many combinations. The two simpler cases are covered by:
Globally-Caching-Multipart-AtomSpace = (Multipart-AtomSpace x AtomSpace)^2
Locally-Caching-Multipart-AtomSpace = Union_k Caching-Atomspace^k
In general a Multipart-AtomSpace that has caches associated with various groupings of its parts may be called a Complex-Atomspace, a category that groups together the various possibilities mentioned above.
So we have a set of Atoms, distributed across machines, cached as necessary. Now how does this AtomSpace evolve?
In the Webmind AI Engine (a predecessor of the NCE), we gave Nodes and Links the freedom to receive attention directly from the CPU, and then decide what actions to take. This was consistent with Webmind's Multi-agent System architecture and design, and it was also satisfyingly close to the conceptual foundations of the patternist perspective on mind. However, it led to inefficiency as the number of actors in the system scaled up. In order to address this performance issue, in architecting OCP, we decided it was necessary to create a software system that deviated further from the structures naturally suggested by the philosophical foundations. This doesn't mean that the OCP architecture fails to represent the philosophical foundations; it means rather that the mapping from philosophy to software is a little more complex. This complexity is the price paid for efficient performance on contemporary commodity hardware.
In OCP, most of the Atoms are inert, in the sense that they don't contain pieces of code telling them how to act on other Atoms. (The exceptions are grounded SchemaNodes and PredicateNodes, which correspond to OpenCogPrime:ComboTrees living in the OpenCogPrime:ProcedureRepository; but in this case, there is a MindAgent that chooses nodes of this type and enacts their internal schemata/predicates for them.) Rather than cycling through Atoms and letting them explicitly act, as was done in Webmind, OCP dynamics are carried out by objects called MindAgents. On each machine implementing part of an OCP AtomSpace, there is a set of MindAgents, which are repeatedly cycled through, each getting a certain amount of processor time, and each using its processor time to enact actions on behalf of certain selected Atoms. Most of the MindAgents in the software system represent CIM-Dynamics (dynamical processes that are concerned with modifying and creating Atoms in an AtomSpace), but there are also MindAgents that take care of purely system-level tasks rather than mind-ish tasks.
The interrelation between the conceptual/mathematical structure of AtomSpaces, and the machines implementing the AtomSpaces, is moderately subtle. A multipart AtomSpace, as defined above, refers to an AtomSpace that is conceptually divided into several different parts. But each of these different conceptual parts may be implemented on one or more machines. So, for instance, a simple AtomSpace may be running on one, three, or 47 different machines. A MindAgent running on a certain machine has free access to all Atoms within its containing simple AtomSpace. Some MindAgents may choose to primarily act on Atoms living in the same machine as they do. But if need be, they can freely act on Atoms living on other machines serving the same AtomSpace. On the other hand, if an Atom A is on a machine living in a different AtomSpace from MindAgent M, then it can't be assumed that M has the power to modify A. Interactions between different simple AtomSpaces in the same multipart AtomSpace are handled by the OpenCogPrime:MindDB design. Essentially, the MindDB represents a centralized database of Atoms, and mediates cases where different Units want to affect each other's Atoms, or independently contain Atoms that represent the same thing (i.e. are copies of each other).
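The contrast between inert Atoms and active MindAgents can be sketched as follows. This is an illustrative toy, not the OCF scheduler: `DecayAgent`, `StatsAgent`, the integer `budget` standing in for processor time, and the `cycle` function are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Atom:
    """Inert data: an Atom carries state but no code of its own."""
    name: str
    importance: float


class MindAgent:
    """Base class: a process that acts on Atoms on their behalf."""

    def run(self, atomspace: Dict[str, Atom], budget: int) -> None:
        raise NotImplementedError


class DecayAgent(MindAgent):
    """Mind-level agent (hypothetical): halves the importance of up to `budget` Atoms."""

    def run(self, atomspace, budget):
        for atom in list(atomspace.values())[:budget]:
            atom.importance *= 0.5


class StatsAgent(MindAgent):
    """System-level agent (hypothetical): records the AtomSpace size, modifies nothing."""

    def __init__(self):
        self.sizes: List[int] = []

    def run(self, atomspace, budget):
        self.sizes.append(len(atomspace))


def cycle(agents, budgets, atomspace, rounds):
    """Repeatedly cycle through the agents, giving each its budget per turn."""
    for _ in range(rounds):
        for agent, budget in zip(agents, budgets):
            agent.run(atomspace, budget)


atomspace = {"a": Atom("a", 1.0), "b": Atom("b", 0.5)}
stats = StatsAgent()
cycle([DecayAgent(), stats], [2, 1], atomspace, rounds=2)
```

All behaviour lives in the agents; the Atoms are plain records that the agents read and update.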
So, when a CIM-Dynamic-embodying MindAgent is active, it selects Atoms (by some criterion, often importance) from the AtomSpace in which it lives, and acts on them, modifying their state and/or creating new Atoms based on them. For instance, there is one MindAgent that deals with schema execution, which selects SchemaInstanceNodes and allows them to enact the schema functions within themselves, thus transforming other Atoms or creating new ones. There is another that deals with first-order logical inference, inspecting logical links in the AtomSpace and creating new ones.
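The select-then-act pattern described here can be sketched with a toy inference step. It assumes a deliberately simplified representation: links are `(source, target)` pairs with a single importance number, selection takes the `k` most important links, and the only rule is a transitive deduction. Real OCP inference attaches truth values to links and is considerably richer; the function names below are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Link = Tuple[str, str]  # (source, target) of a hypothetical inheritance link


def select_by_importance(importance: Dict[Link, float], k: int) -> List[Link]:
    """Pick the k most important links from the AtomSpace."""
    return heapq.nlargest(k, importance, key=importance.get)


def deduction_step(importance: Dict[Link, float], k: int) -> List[Link]:
    """From selected links A->B and B->C, create A->C if it does not exist yet."""
    selected = select_by_importance(importance, k)
    new_links: Dict[Link, float] = {}
    for (a, b) in selected:
        for (c, d) in selected:
            if b == c and a != d and (a, d) not in importance:
                # Assumption: a new link starts with the weaker premise's importance.
                new_links[(a, d)] = min(importance[(a, b)], importance[(c, d)])
    importance.update(new_links)
    return sorted(new_links)


links = {
    ("cat", "mammal"): 0.9,
    ("mammal", "animal"): 0.8,
    ("rock", "mineral"): 0.1,
}
created = deduction_step(links, k=2)
```

With `k=2` only the two most important links are inspected, so the low-importance link is ignored on this turn and one new link is added to the AtomSpace.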
Processor time allocation among MindAgents is done based on a simple scheduling process — different types of action, embodied in different MindAgents, receive different slices of time, and then they can allocate their slices using different policies. We will return to this scheduling algorithm in OpenCogPrime:Scheduling.
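The two-level allocation described here can be sketched as below. The text does not specify how slices are computed, so proportional weighting is an assumption made for the example, and the agent names, weights, and the two per-agent policies are all hypothetical.

```python
from typing import Dict


def allocate_slices(weights: Dict[str, float], cycle_ms: float) -> Dict[str, float]:
    """Level 1: split one scheduling cycle among MindAgents in proportion to weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one agent needs a positive weight")
    return {name: cycle_ms * w / total for name, w in weights.items()}


def split_evenly(slice_ms: float, atoms: Dict[str, float]) -> Dict[str, float]:
    """Level 2, policy A: an agent spends its slice equally on its selected Atoms."""
    return {name: slice_ms / len(atoms) for name in atoms}


def split_by_importance(slice_ms: float, atoms: Dict[str, float]) -> Dict[str, float]:
    """Level 2, policy B: an agent spends its slice in proportion to Atom importance."""
    total = sum(atoms.values())
    return {name: slice_ms * imp / total for name, imp in atoms.items()}


slices = allocate_slices({"inference": 3, "schema_execution": 1}, cycle_ms=100)

# Each agent then applies its own policy to the slice it received.
inference_plan = split_by_importance(slices["inference"], {"a": 2, "b": 1})
execution_plan = split_evenly(slices["schema_execution"], {"s1": 1, "s2": 1})
```

Separating the two levels keeps the scheduler ignorant of what each agent does with its time, which matches the idea that different types of action may use different policies.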
We use the term OpenCogPrime:Unit to refer to a simple AtomSpace together with a collection of CIM-Dynamics (which in practice are implemented in MindAgents). Each Unit may potentially run on multiple machines; we use the term Lobe to refer to an individual software process that is part of a Unit. A Unit consists in general of multiple Lobes running in multiple software processes (generally on multiple machines), but sharing the same "distributed virtual AtomSpace." Most discussions regarding OCP cognition may be carried out on the level of Units rather than Lobes, but for instance the discussion of MindAgent scheduling in OpenCogPrime:Scheduling is an exception.
Next, we use the term OpenCogPrime:UnitGroup to refer to a multipart AtomSpace, each component of which is associated with some CIM-Dynamics, and which as a whole may also be associated with a collection of CIM-Dynamics. In theory a Unit-Group could be a grouping of groupings of groupings of ... groupings of AtomSpaces. In the current OCP design it never gets all that deep. For instance, we have the Evolutionary Programming Unit-Group, which contains a set of Unit-Groups corresponding to evolving populations with different fitness functions. And in a full-on self-modification OCP configuration, we will have Unit-Groups corresponding to whole OCP systems, each being studied and optimized by a global controller OCP contained in its own Unit-Group. This results in a maximum depth of three: in the worst case that seems feasibly likely to be necessary for human-level AGI, we have a Unit-Group that is a set of sets of sets of simple AtomSpaces.
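The nesting of Units inside Unit-Groups, and the depth bound of three, can be made concrete with a small recursive structure. The class names mirror the terms used in the text, but the member names and the particular arrangement below are purely illustrative and are not claimed to match any actual OCP configuration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Unit:
    """A simple AtomSpace plus its CIM-Dynamics (contents omitted in this sketch)."""
    name: str


@dataclass
class UnitGroup:
    """A grouping whose members are Units or further Unit-Groups."""
    name: str
    members: List[Union["Unit", "UnitGroup"]]


def depth(item: Union[Unit, UnitGroup]) -> int:
    """Nesting depth: 0 for a Unit, 1 for a group of Units, and so on."""
    if isinstance(item, Unit):
        return 0
    return 1 + max((depth(m) for m in item.members), default=0)


def check_depth(item: Union[Unit, UnitGroup], max_depth: int = 3) -> bool:
    """True if the structure stays within the stated maximum nesting depth."""
    return depth(item) <= max_depth


# Hypothetical arrangement: a set of sets of sets of simple AtomSpaces.
population_1 = UnitGroup("population-1", [Unit("p1-u1"), Unit("p1-u2")])
population_2 = UnitGroup("population-2", [Unit("p2-u1")])
evolutionary = UnitGroup("evolutionary-programming", [population_1, population_2])
system = UnitGroup("system", [evolutionary, Unit("core")])

too_deep = UnitGroup("wrapper", [system])
```

Here `system` has depth three, while wrapping it once more exceeds the bound.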
A little more formally, if we define
Complex-Dynamic-AtomSpace = ComplexAtomSpace x CIM-Dynamic-Set
then we see that the above discussion implies the definition
Unit = Complex-Dynamic-Atomspace
Unit-Group = Union_k Unit^k
We may then define
Dynamic-Multipart-AtomSpace = Union_k (Unit-Group Union Unit)^k
A complex OCP configuration, at any given time, is an OpenCogPrime:ComplexAtomSpace: that is, a dynamic multipart AtomSpace. It will contain multiple Units, each of which hosts its own configuration of CIM-Dynamics and contains a subset of the Atoms in the OCP system.