Concentric Rings Architecture View
This document presents one way in which we could understand the OpenCog architecture. It is inspired by Jim Rutt's idea that we could think of the architecture as a set of concentric rings, with those further from the center being more application-oriented and more flexibly deployed, while those close to the center have stringent requirements.
We could think of the rings as:
- Ring 1: The Atomspace (of which there can be many, on a single machine and/or distributed).
- Ring 2: Modules (we could call these cognitive processes, but the overloading with Unix processes is confusing; Module isn't a perfect name either, given the existence of CogServer modules in the codebase, but it seems clear enough in this document) running on the same machine as their Atomspace. They could access the Atomspace by running in the same process (via a refactored CogServer) or via some inter-process coordination framework such as ZeroMQ messaging, shared memory, or a combination of both.
- Ring 3: Application processes, which could be local or remote and communicate with the main Atomspace via messaging (a sockets interface into the Scheme shell, ROS, REST, ZeroMQ, etc.).
It's useful in some cases to split Ring 1 into Ring 0, which is the AtomTable itself, and Ring 1, which is the Atomspace project, including bindings, pattern matcher, etc.
It may be useful to split Ring 2 into two levels if we find modules that are not part of the Atomspace but must run in the same process to avoid being too slow.
It's perhaps useful to split Ring 3 into modules that need fast, intracluster access and those that can live with slower access and be deployed on faraway machines accessing some remote cloud infrastructure.
For Ring 3, it's important to keep in mind which Atomspace is the main one, as Ring 3 modules may have their own local Atomspaces for performance purposes while being dedicated to working with a main, remote Atomspace.
Ring 1: What is Inside the Atomspace?
At the most basic, the Atomspace is just a container for Atoms with a simple API for manipulating those Atoms. This means:
- The AtomTable (a hash set, or in C++ STL lingo an unordered_set) and its internal indexes.
- A backing store for persistence.
- A type hierarchy of Atoms.
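To make the Ring 0 idea concrete, here is a minimal sketch in plain Python (not OpenCog's actual C++ implementation; the class names, fields, and methods here are illustrative assumptions, not the real AtomTable API) of a deduplicating Atom container with one internal index:

```python
# Toy model of the AtomTable idea: a hash-set-like container of Atoms
# plus a by-type index. Purely illustrative; not the OpenCog API.

class Atom:
    def __init__(self, atom_type, name=None, outgoing=()):
        self.atom_type = atom_type      # e.g. "ConceptNode", "InheritanceLink"
        self.name = name                # only Nodes have names
        self.outgoing = tuple(outgoing) # only Links have outgoing sets

    def key(self):
        # Atoms are deduplicated by (type, name, outgoing), like a hash set.
        return (self.atom_type, self.name,
                tuple(a.key() for a in self.outgoing))

class AtomTable:
    def __init__(self):
        self._atoms = {}        # key -> Atom (the "unordered set")
        self._type_index = {}   # type -> set of keys (one internal index)

    def add(self, atom):
        k = atom.key()
        if k in self._atoms:            # adding an existing Atom is a no-op
            return self._atoms[k]
        self._atoms[k] = atom
        self._type_index.setdefault(atom.atom_type, set()).add(k)
        return atom

    def get_by_type(self, atom_type):
        return [self._atoms[k] for k in self._type_index.get(atom_type, ())]

table = AtomTable()
cat = table.add(Atom("ConceptNode", "cat"))
animal = table.add(Atom("ConceptNode", "animal"))
table.add(Atom("InheritanceLink", outgoing=(cat, animal)))
table.add(Atom("ConceptNode", "cat"))  # duplicate; deduplicated

print(len(table.get_by_type("ConceptNode")))     # 2
print(len(table.get_by_type("InheritanceLink"))) # 1
```

The point of the sketch is the contract, not the implementation: a simple CRUD-like API over a deduplicated set of Atoms, with indexes making by-type (and similar) lookups cheap.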
The Atomspace as currently conceived and developed, however, is far richer than that. Based on the current separation between the atomspace and opencog projects on GitHub, the Atomspace includes:
- Scheme (guile) bindings, along with a number of utilities for writing Scheme code that interacts with the Atomspace (and the embedded interpreter with its persistent namespace, which is useful from Python as well).
- Python bindings, less complete and less stable than the Scheme ones.
- Soon, Haskell bindings.
- Among the many Atom types, special Atoms called FunctionLinks, which are used to trigger dynamics inside the Atomspace (BindLink, SatisfactionLink, GetLink, PutLink, DefineLink, ExecutionOutputLink, etc.).
- Specialized backing stores (to files, to Postgres, soon to Neo4j, in the past via ZeroMQ) which can also be used for distribution.
- The pattern matcher, which can be used to execute queries in the Atomspace and, via the use of FunctionLinks in those queries, trigger graph rewriting.
- The unified rule engine framework, including a backward chainer and a forward chainer.
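To illustrate the pattern matcher item above, here is a toy sketch in plain Python (the tuple representation and function names are illustrative assumptions, not the actual C++ or Scheme API): a query containing a variable is matched against stored links, and each match instantiates a rewrite template, which is roughly what a BindLink does.

```python
# Toy illustration of pattern matching plus graph rewriting.
# Atoms are flattened to tuples here purely for simplicity.

atoms = {
    ("InheritanceLink", "cat", "animal"),
    ("InheritanceLink", "dog", "animal"),
    ("InheritanceLink", "rock", "mineral"),
}

def match(pattern, atomset):
    """Return variable bindings ($-prefixed slots) for each matching atom.
    (No repeated-variable consistency check in this toy version.)"""
    results = []
    for atom in atomset:
        if len(atom) != len(pattern):
            continue
        bindings = {}
        for p, a in zip(pattern, atom):
            if p.startswith("$"):
                bindings[p] = a
            elif p != a:
                break
        else:
            results.append(bindings)
    return results

def bind(pattern, rewrite, atomset):
    """BindLink-style: for each match, instantiate the rewrite template."""
    new_atoms = set()
    for b in match(pattern, atomset):
        new_atoms.add(tuple(b.get(x, x) for x in rewrite))
    return new_atoms

# "For every $X that inherits from animal, assert that $X breathes."
produced = bind(("InheritanceLink", "$X", "animal"),
                ("EvaluationLink", "breathes", "$X"),
                atoms)
print(sorted(produced))
# [('EvaluationLink', 'breathes', 'cat'), ('EvaluationLink', 'breathes', 'dog')]
```

The real pattern matcher walks hypergraphs rather than flat tuples and supports much richer queries, but the match-then-rewrite shape is the same.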
We can assume that the AtomTable is our innermost Ring, Ring 0, if we want to make that distinction. The Atomspace class provides a simple API that doesn't go much further than Ring 0.
The bindings, Scheme utilities, pattern matcher, and unified rule engine provide a much richer API than Ring 0, and that is the API that people will most often use to write code that relies on an Atomspace for hypergraph storage. So it makes sense that this set of utilities be considered as part of Ring 1: while they aren't necessarily part of the conceptual Atomspace, it makes no sense for them to be run in separate processes, and the Atomspace becomes a lot less useful/friendly without these utilities.
The unified rule engine is generic in nature and, along with the basic chainers, should also be considered part of Ring 1 (given the granularity of the knowledge representation provided by the Atomspace, some chaining is useful in many scenarios). Specialized callbacks currently in the URE codebase may or may not belong on an outer ring (I don't know enough to have a strong opinion).
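As a rough illustration of the kind of generic chaining the rule engine framework provides, here is a minimal forward chainer in plain Python (the facts and the rule are toy stand-ins, not the URE's actual representation): rules are applied to the known facts until a fixpoint is reached.

```python
# Toy forward chaining to a fixpoint. Illustrative only.

facts = {("inherits", "cat", "mammal"),
         ("inherits", "mammal", "animal")}

def transitivity(known):
    """One rule: inherits(A,B) & inherits(B,C) => inherits(A,C)."""
    derived = set()
    for (r1, a, b) in known:
        for (r2, b2, c) in known:
            if r1 == r2 == "inherits" and b == b2:
                derived.add(("inherits", a, c))
    return derived

def forward_chain(initial, rules):
    known = set(initial)
    while True:
        new = set()
        for rule in rules:
            new |= rule(known)
        if new <= known:      # fixpoint: nothing new was derived
            return known
        known |= new

closure = forward_chain(facts, [transitivity])
print(("inherits", "cat", "animal") in closure)  # True
```

The URE generalizes this pattern: rules are themselves Atoms, rule application goes through the pattern matcher, and chaining can run forward from premises or backward from goals.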
The diagram below depicts the components of Ring 1 (including the AtomTable and its CRUD-like API as a core Ring 0). Stars indicate components that provide part of the Atomspace's public "API", so to speak. This includes the thin wrapper around the AtomTable API as well as higher level constructs, such as the Pattern Matcher, which backs into FunctionLinks for many useful things.
What Else Might Be Ring 1?
One could argue for the REST API to be in Ring 1 (as another way to remotely access an Atomspace). On the other hand, one may decide that the Atomspace provides language bindings, but not networking bindings (so ROS and REST are both outside the Atomspace). I don't know enough to have a strong opinion.
There's ongoing discussion about whether the TimeServer and SpaceServer should exist as separate data structures (which should be integrated with the Atomspace) or as specialized linkage and code backing into the Atomspace (either the main Atomspace or specialized ones). Either way, they should be in Ring 1. Similar discussions periodically arise about the possibility of Atomspace plugins or property tables.
On the other hand, it's good that the CogServer is currently outside the Atomspace.
Outer Rings: Non-Atomspace Modules
There is a large number of modules outside the Atomspace, and we'd like to explore their natural, or most common, places in this ring hierarchy (understanding that for some modules there will be use cases that place them on different rings).
A hierarchy of rings could be built from one or more of these perspectives:
- Cognitive architecture: more concrete, lower-level modules vs. abstract ones that build on top of the concrete ones.
- Runtime requirements: modules that require low-latency and/or high-throughput access vs. those that live well with lesser performance.
- Code dependency: modules that work very tightly with the Atomspace (especially assorted Ring 1 utilities) vs. modules that work with the Atomspace via the previous ones vs. modules that don't even need the Atomspace.
I enumerate most of the modules below and describe where they fit under each perspective. A useful pattern emerges: the runtime perspective wins out, and it becomes easy to classify most modules once one takes this systematic approach. The diagram below depicts what goes (tentatively, preliminarily, and subject to correction by people who know more than I do at the moment) into Ring 2 vs. Ring 3.
In cases where extensive redesign is in progress, as with the Embodiment and OpenPSI modules, I describe what the module should be rather than worry about the obsolete design or the soon-to-be-obsolete code.
DeSTIN [Ring 3]
Right now, there is zero integration between DeSTIN and the Atomspace. DeSTIN would be Ring 3 from all perspectives.
It's useful to consider a future scenario in which DeSTIN internal node states are fed to the Atomspace during learning, and PLN is used to provide semantic feedback for training, by connecting these states with other, similar states, and/or by connecting them with other background knowledge about the instances being evaluated. This leads to two usage scenarios:
- Loose integration: data is eventually exported by DeSTIN and conclusions are fed back on demand, leading to a dynamic similar to "active learning", with PLN playing the user role (this is very similar to the planned MOSES-PLN integration). As with MOSES, DeSTIN would be Ring 2 from a cognitive architecture perspective, but Ring 3 from the code and runtime perspectives.
- Tight integration: the above dynamics are constantly running, which would place DeSTIN squarely in Ring 2 from the runtime perspective.
I believe the near future will lead us to the loose integration scenario; I also note that it's possible to design similar integrations with other deep learning toolkits.
Dimensional Embedding [Ring 2]
It's hard to see this working without being in the same runtime process as the Atomspace being embedded (or creating a temporary Atomspace with the Atoms to be embedded). It seems to belong in Ring 2 from all perspectives.
ECAN [Ring 2]
My knowledge of ECAN is obsolete, and apparently the current design (if not the actual code just yet) allows for great flexibility in the resources allocated to ECAN. Still, it must run on the same machine, and most likely in the same process, as the Atomspace (assuming we get the multithreading to behave well enough). It's Ring 2 from the runtime perspective, strongly enough to make the other perspectives irrelevant.
Embodiment [Rings 2 AND 3]
Removing OpenPSI (discussed separately below), we're left with:
- ROS integration (not just the ability to pass messages through ROS, but which messages get sent to which nodes, which messages are received from which nodes, and how their contents are handled) is Ring 2 from a cognitive perspective, but Ring 3 from a code dependency or runtime perspective.
- Perception is Ring 2 from all perspectives, given the huge volume of Atom creation and removal. (There has been discussion about specialized perceptual Atomspaces with a very simple ECAN for rapid forgetting.)
- Management of action execution is unclear to me at the moment -- it could be designed in many ways, but it may be desirable to make it a thin layer on top of the rule engine and a knowledge store, which would make it Ring 2 from all perspectives.
Natural Language Understanding [Ring 3]
Right now this is a combination of:
- Link grammar, which is conceptually outside OpenCog.
- RelEx, Java code running on its own server and calling the link parser internally.
- RelEx2Logic, which has a Java component running in the RelEx server and a Scheme module running on top of the Atomspace (currently via the CogServer).
We plan to replace the current RelEx2Logic with unified rule engine rules driven by Scheme, so communication with the RelEx server would be the entry point into OpenCog. RelEx is then clearly Ring 3 from the code and runtime perspectives. RelEx2Logic is Ring 3 from a runtime perspective (it doesn't have strong performance requirements, because RelEx and the link parser aren't incredibly fast to begin with) and Ring 3 from a cognitive architecture perspective.
Natural Language Generation [Ring 2]
This starts with a search for things to say (done by various schemata controlled by OpenPSI). The results of this search (a bunch of Atoms) are fed to the microplanner, which breaks them down into sentence-sized chunks corresponding to different kinds of sentences. These chunks are sent to SuReal, which translates the Atoms into sentences at a syntactic level via pattern matching and post-processing code.
From the cognitive architecture perspective these are Ring 3, but from the runtime and code dependency perspectives they are all Ring 2. Note that the microplanning stage generates a large number of calls to SuReal, which we may execute via a distributed processing framework. Still, the SuReal queries would run inside the same process as the Atomspace on each worker machine.
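The chunking step described above can be sketched as follows (a hypothetical toy in Python, not the actual microplanner code; the relations and the chunk-size parameter are invented for illustration): the Atoms to be expressed are greedily grouped into sentence-sized chunks, each of which would then be handed to SuReal for surface realization.

```python
# Toy microplanning: split the "things to say" into sentence-sized
# chunks. The real microplanner uses linguistic criteria, not a fixed
# size; this greedy grouping is purely illustrative.

def microplan(relations, max_per_sentence=2):
    """Greedily group relations into sentence-sized chunks."""
    chunks, current = [], []
    for rel in relations:
        current.append(rel)
        if len(current) == max_per_sentence:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks

to_say = [("likes", "Bob", "tea"),
          ("drinks", "Bob", "tea"),
          ("lives_in", "Bob", "London")]

chunks = microplan(to_say)
print(len(chunks))  # 2 sentence-sized chunks
```

Each chunk would become one SuReal query; since those queries are pattern matches against the Atomspace, this is where the Ring 2 runtime coupling comes from.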
MOSES [Ring 3]
Right now MOSES is used most often as an external tool. Nil is working on its long-anticipated integration with PLN, where PLN will guide MOSES on feature selection (including via transfer learning), deme management, etc. Models are already exported into the Atomspace. Even with this integration, though, MOSES will only sporadically export models into the Atomspace and sporadically get PLN guidance, which puts it on Ring 3.
In a distributed MOSES deployment, one could conceive of a master-workers setting where the master is on the same server as the Atomspace where PLN works on the MOSES models, if we have a lot of background knowledge there. Or we could have a peer setting where each machine has a smaller Atomspace, and PLN conclusions are shared among the machines. The choice is application-driven, and the current bio application would use the former model with one big Atomspace (possibly distributed if it becomes large enough).
Pattern Miner [Ring 2]
This fits into Ring 2 from all perspectives. It actually uses its own temporary Atomspace to work, so it needs to be in the same process as that Atomspace (though it could be a separate process from the Atomspace that data comes from and conclusions go to; it just wouldn't work well on a separate machine). If we're mining large Atomspaces, we'd probably have a distributed main Atomspace, with pattern miner processes running on each machine. This is similar to the distributed framework for language generation mentioned above.
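As a rough sketch of the mining idea (plain Python with invented data; the real pattern miner works over Atoms in its temporary Atomspace and considers far richer generalizations), one can abstract each link by replacing a constant with a variable and count how often each abstract pattern recurs:

```python
# Toy pattern mining: generalize links by swapping one constant for a
# variable, then count occurrences of each abstract pattern.
from collections import Counter

atoms = [("InheritanceLink", "cat", "animal"),
         ("InheritanceLink", "dog", "animal"),
         ("InheritanceLink", "cat", "pet"),
         ("EvaluationLink", "eats", "cat")]

def abstractions(atom):
    """Yield patterns with each argument position generalized to $X."""
    for i in range(1, len(atom)):
        yield atom[:i] + ("$X",) + atom[i + 1:]

counts = Counter(p for atom in atoms for p in abstractions(atom))
frequent = [p for p, n in counts.items() if n >= 2]
print(sorted(frequent))
# [('InheritanceLink', '$X', 'animal'), ('InheritanceLink', 'cat', '$X')]
```

The per-machine deployment mentioned above amounts to running this counting locally on each shard of a distributed Atomspace and merging the counts.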
OpenPSI [Rings 2 AND 3]
This is interesting, because OpenPSI encapsulates basic reactive decision making, urgent reactions to high salience perceptions or emotions, and cooler, long term cognitive decision making and action execution management. In fact, the MindCloud idea is that OpenPSI could be split between Rings 2 and 3 from the runtime perspective, with some cognitive ability provided inside the agent or app that controls it, and deeper cognition via the cloud.
Planner [Ring 2]
I'm talking about the envisioned port of the planner to the unified rule engine. In that case, it's still Ring 3 from a cognitive architecture perspective, but Ring 2 from a runtime perspective, given how much it will need to interact with the Atomspace. The code could conceivably be implemented in a way that's not heavily dependent on the Atomspace, but this seems unlikely. The current plan is to run planning as a specialized PLN backward chaining task, with special weighting for rule choice and some temporal ordering constraints. This may be post-processed by a scheduling algorithm to optimize the plan for execution.
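To make the idea concrete, here is a toy backward chainer in plain Python (the rules, weights, and action names are all hypothetical, and it ignores truth values and the temporal ordering constraints mentioned above): it works backwards from a goal, preferring higher-weight rules at each choice point, which is the crude analogue of weighted rule choice in a PLN backward chaining task.

```python
# Toy weighted backward chaining for planning. Illustrative only.

rules = [
    # (weight, head, body): head is achievable if every body item is
    (0.9, "at_cafe", ("walk_to_cafe",)),
    (0.8, "has_coffee", ("at_cafe", "order_coffee")),
]
actions = {"walk_to_cafe", "order_coffee"}  # primitively achievable

def plan(goal):
    if goal in actions:
        return [goal]
    # try applicable rules, highest weight first (crude rule choice)
    for weight, head, body in sorted(rules, reverse=True):
        if head == goal:
            steps = []
            for sub in body:
                sub_plan = plan(sub)
                if sub_plan is None:
                    break
                steps.extend(sub_plan)
            else:
                return steps
    return None  # goal is neither an action nor derivable

print(plan("has_coffee"))  # ['walk_to_cafe', 'order_coffee']
```

The post-processing scheduling step described above would then reorder or interleave the resulting action sequence subject to temporal constraints.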
PLN [Ring 2]
PLN is a fundamental tool, and it consists only of the inference rules at the moment (since the URE is part of the Atomspace). As always, inference control beyond the basic backward and forward chainers remains to be tuned. Some specialized callbacks from the pattern matcher are used to guide rule application, but these are currently inside the Atomspace (because they might one day, in principle, be generic). In any case, it's on Ring 2 from all perspectives. It could use its own temporary Atomspace for large inference tasks, in a way similar to the pattern miner.
System-Level Modules [All over the place]
These modules are all Ring 1 from a cognitive architecture perspective, which isn't the important one.
The CogServer, or whatever takes on its role of providing a runtime with an Atomspace and other dynamics running in the same process, is Ring 2 by definition.
As mentioned above, it's unclear why the REST API, TimeServer and SpaceServer aren't considered part of the Atomspace now.
The REST API is an interesting one: it's either Ring 1 or Ring 3. If we decide the Atomspace doesn't provide a REST API, it's Ring 3, as it doesn't need to be on the same machine as the Atomspace and the Python shell it communicates with; this places it on the same level as visualization applications, other ROS nodes, etc. If we decide the Atomspace should have a REST API, then it's Ring 1.
Speaking of which, ROS is Ring 3 from the code and runtime perspectives (though it can be deployed as Ring 2, of course).
The TimeServer and SpaceServer don't make sense as currently implemented. It's possible to think of them as being on a similar conceptual level as dimensional embedding: they provide a special purpose view of data in the Atomspace. But this seems artificial to me, and right now I believe their place is on Ring 1, however they end up being implemented.
There is a large number of visualization tools, most of them probably broken or useless; ocViewer seems to be the most functional and useful one. They are all Ring 3 in my view.
There are other modules on GitHub, but they weren't included, for a number of reasons. Some are obsolete: I have purposely ignored the Python implementation of PLN (obsoleted by the URE-based PLN), Fishgram (obsoleted by the current planner, itself soon to be obsoleted in turn), and a multitude of visualization tools.
I also didn't include most of the NLP tools (some of which are obsolete, some others are research projects paused mid-way, etc), primarily because I don't know enough about any of them to make a useful classification.
If I missed something of importance, let me know.
Putting it All Together
Finally, here's a diagram that puts all of the rings in the same picture, with some detail.