This document is based on recent discussions with Ben and Linas about the MindCloud and, relatedly, coordination of multiple processes running around the same Atomspace (see Concentric_Rings_Architecture_View). During those discussions, it became clear that there is a functional way to coordinate multiple processes using the same master Atomspace, which relies on the PostgreSQL backing store to act as the master Atomspace (and on those processes including code to read and write to the master Atomspace).
While this idea isn't optimal from a performance or scalability standpoint, it is simple and relies primarily on existing code, so it could be rapidly developed into a prototype. Even if we discard this prototype, it would be useful as a learning experience and as a baseline against which more sophisticated architectures should be compared.
Baseline Distributed and Parallel Processing Vision
This is a very simple architecture that could be prototyped and tested without too much coding work.
We would have a process (in the Unix sense) that is responsible for "serving" a mind. The MindCloud would operate on a one-process-per-mind basis, with multiple processes per machine.
We'd use some sort of load balancing and routing framework to direct client requests to one machine in a cluster of MindCloud servers. We need to evaluate existing options for this, but I think it's very likely that we'd find something usable among the many open source libraries in this domain.
We would have a single master Atomspace: the existing PostgreSQL backing store. All minds in one MindCloud are saved to this one DB, and when a mind needs to be loaded and run, we load it from the DB. This DB can run on a single, fatter server, or it can be distributed (there are multiple ways to distribute PostgreSQL databases; I don't know which, if any, have been tested with the Atomspace use cases).
This master Atomspace contains individual minds' knowledge, as well as shared knowledge (abstracted experiences, common background knowledge, etc).
Running a Single Mind
Each MindCloud server would have a coordinator process whose job is to spawn single-mind-controlling processes and, perhaps, to monitor them so that zombies and runaway consumers of CPU/RAM can be killed.
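The coordinator's spawn-and-monitor loop could be sketched roughly as follows. This is a hypothetical illustration: the `cogserver` command, the lifetime-budget policy, and the function names are all placeholders (a real coordinator would check CPU/RAM usage, e.g. via psutil, and process liveness rather than just elapsed time).

```python
import subprocess
import time

def spawn_mind(command):
    """Start a single-mind-controlling process and record its start time."""
    return {"proc": subprocess.Popen(command), "started": time.time()}

def reap_runaways(minds, max_lifetime_s):
    """Kill processes that have exceeded a crude lifetime budget.

    A production coordinator would instead check CPU/RAM consumption and
    heartbeats, but the control flow would look the same."""
    survivors = []
    for m in minds:
        proc = m["proc"]
        if proc.poll() is not None:
            continue                       # already exited: drop it
        if time.time() - m["started"] > max_lifetime_s:
            proc.kill()                    # runaway: terminate it
            proc.wait()
        else:
            survivors.append(m)
    return survivors
```

The coordinator would call `reap_runaways` periodically, keeping only the surviving process records.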
The process that controls a single mind is either the CogServer or whatever replaces/extends it. It contains its own in-RAM Atomspace object and a collection of cognitive processes. As this is all determined by a configuration file, it's easy for the same cluster to serve different "species" of Mind, such as Hanson robots and Telehealth smart pill dispensers. It may not be desirable to do so, however, for load balancing and performance optimization reasons.
When this process is started, the Atomspace object is populated with the results of a query to the PostgreSQL backing store. We would need to devise the knowledge representation for this, in order to ensure that we could query the PostgreSQL DB for the Atoms relevant to an individual mind (and, optionally but very likely, more general background knowledge and shared knowledge relevant to an entire "species" of minds).
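To make the load step concrete, here is one way the load-time query might be composed, under the assumed (hypothetical) representation where each Atom row carries an `owner` column holding a mind id, a species id, or NULL for fully shared knowledge. The table and column names are illustrative only; the actual Atomspace Postgres schema would dictate the real query.

```python
def mind_load_query(mind_id, species_id):
    """Return a parameterized SQL query selecting the Atoms to load
    into a fresh in-RAM Atomspace for one mind (sketch, assumed schema)."""
    sql = (
        "SELECT uuid, type, name, outgoing, tv "
        "FROM atoms "
        "WHERE owner = %s "        # this mind's private knowledge
        "   OR owner = %s "        # species-wide background knowledge
        "   OR owner IS NULL"      # knowledge shared by all minds
    )
    return sql, (mind_id, species_id)
```

The result set would then be walked to instantiate Atom objects in the process's in-RAM Atomspace.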
These processes, once loaded up with their Atomspace, function as autonomous "minds" in the MindCloud. They will create and remove Atoms as needed, according to the cognitive dynamics appropriate to that kind of mind. At some point the RAM Atomspace should be synced back to the PostgreSQL DB. How this syncing happens is probably going to be application-specific, depending on various factors. Some simple initial options are:
- Sync to disk whenever a mind is going to be removed from RAM (its process will be terminated, either in response to a client call to the MindCloud that terminates a session, or because it's been idle without client interaction longer than a timeout, etc).
- Sync to disk at fixed intervals (we could spawn a thread inside the process running that mind, which wakes up every X minutes and saves the Atomspace)
- Sync to disk in response to specific events (different applications could have triggers that determine when something has just happened that is important enough to merit saving a snapshot of the Atomspace).
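The fixed-interval option above is simple enough to sketch directly: a daemon thread inside the mind's process wakes up every `interval_s` seconds and calls a save callback. Here `save_atomspace` is a placeholder for the real call into the PostgreSQL backing store.

```python
import threading

def start_sync_thread(save_atomspace, interval_s, stop_event):
    """Spawn a daemon thread that periodically saves the Atomspace."""
    def loop():
        # Event.wait doubles as an interruptible sleep, so shutdown
        # does not have to wait out a full interval.
        while not stop_event.wait(interval_s):
            save_atomspace()
    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t
```

On clean shutdown (e.g. session termination), the process would set the stop event, join the thread, and then perform one final sync.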
In all cases there's a choice to be made of what is to be saved. By default this is just ECAN's decision based on the AttentionValue of each Atom.
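That default policy could look like the following: persist only Atoms whose short-term importance (STI, part of ECAN's AttentionValue) clears a threshold. Atoms are reduced to plain dicts here for illustration; real code would walk the in-RAM Atomspace.

```python
def atoms_to_save(atomspace, sti_threshold):
    """Select the subset of Atoms worth syncing to the backing store,
    based on each Atom's short-term importance (sketch)."""
    return [a for a in atomspace if a["sti"] >= sti_threshold]
```

Application-specific policies could replace or supplement this filter, e.g. always saving Atoms touched by a flagged event.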
I can immediately see three limitations of this baseline system. In increasing order of difficulty, they are: scalability, cognitive architecture complexity/intelligence, and knowledge sharing.
The first limitation of this architecture is scalability. This setup will scale as far as PostgreSQL can scale. There are a couple of obvious ways to scale beyond a single PostgreSQL server's abilities: segregation by "species", and replication, which would give us multiple sub-populations, with each population sharing knowledge only with the other minds backed by the same DB server.
There is also the possibility of replacing PostgreSQL with one of the much more scalable NoSQL alternatives. One would need to carefully understand their performance and scalability trade-offs in order to select the alternative(s) that best match our requirements in terms of data volume, read/write patterns, knowledge representation, etc.
One advantage of this baseline architecture, however, is that it makes it relatively painless (as far as these things go) to switch backing stores for scalability and/or performance, as our needs grow.
Another limitation may be related to intelligence, or cognitive architecture. We're looking at a model in which all the cognitive processes for a given mind run in a single process. Right now this would include the issues caused by the CogServer's round-robin scheduling of MindAgents. Getting around this would require adding multithreading to the CogServer (or replacing it with something that does the same thing in a multithreaded way, if that's easier than refactoring the existing code), or running minds on multiple, coordinated processes. The former is better from a system administration and performance tuning point of view, but is also trickier to program.
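The contrast between the two scheduling models can be shown with MindAgents reduced to plain callables. `run_round_robin` mirrors the current CogServer behaviour (one agent at a time, in order); `run_threaded` is the multithreaded alternative. This is a sketch only: real MindAgents share an Atomspace, so the threaded version additionally requires thread-safe Atomspace access, which is the tricky part mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor

def run_round_robin(agents, cycles):
    """Current model: each agent runs to completion in turn."""
    for _ in range(cycles):
        for agent in agents:
            agent()

def run_threaded(agents, cycles):
    """Alternative: all agents of a cycle run concurrently."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        for _ in range(cycles):
            list(pool.map(lambda a: a(), agents))
```

In the round-robin model a single slow agent stalls every other agent; in the threaded model it only delays the end of its cycle.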
Finally, there is a semantic limitation inherent to the PostgreSQL backing store as a master Atomspace: it does no reconciliation of conflicts. If multiple processes change the same Atom, the last writer will win. This limitation is perhaps less severe in the MindCloud scenario than in a regular, distributed Atomspace scenario (where the multiple machines are all part of the same AI), because most changes to Atoms are limited to a single mind's knowledge.
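A toy illustration of the last-writer-wins behaviour: two processes each hold their own copy of an Atom's truth value, and whichever syncs last silently overwrites the other's change. The dict here stands in for the PostgreSQL backing store.

```python
store = {"atom-1": {"strength": 0.5}}

def sync(store, uuid, local_value):
    """Write a mind's local value back to the master store.
    Unconditional overwrite: no merge, no conflict detection."""
    store[uuid] = local_value

# Processes A and B both loaded strength 0.5, then diverged:
sync(store, "atom-1", {"strength": 0.9})   # A's update
sync(store, "atom-1", {"strength": 0.1})   # B's update arrives later
# A's change is lost; the store holds only B's value.
```

Detecting this would require something like a per-Atom version counter checked at write time, which the current backing store does not do.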
The changes made to the shared knowledge still have to be addressed, of course, and in the long run this requires revision and reconciliation. But for initial applications, perhaps it's enough to have specialized processes dedicated to that, and then either:
- Have mind-controlling processes send over their proposed changes to shared knowledge to these specialized processes, whose job is to handle conflicts and revisions appropriately.
- Have mind-controlling processes send over their experience to these specialized processes, and leave the job of determining how that experience should impact the shared knowledge to the specialized process, not the mind-controlling process.
- Some combination of both.
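The first option above could be sketched as a queue between the mind-controlling processes and a dedicated reconciler. The merge rule here (averaging proposed strengths) is purely illustrative; real revision would presumably use PLN-style truth-value revision.

```python
from queue import Queue

def reconcile(shared, proposals):
    """Drain the proposal queue and merge each proposed change into the
    shared-knowledge map, instead of letting writers overwrite each other.

    `proposals` holds (uuid, strength) pairs; the averaging merge rule
    is a placeholder for a principled revision rule."""
    while not proposals.empty():
        uuid, strength = proposals.get()
        if uuid in shared:
            shared[uuid] = (shared[uuid] + strength) / 2.0
        else:
            shared[uuid] = strength
    return shared
```

In a deployed system the queue would be a network channel (or a DB table) between processes, but the division of labour is the same: minds propose, the specialized process decides.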
It would probably make sense to prototype this baseline system for one or both of the initial MindCloud applications. This requires documenting their requirements (we'd like to do that through semi-informal use case analysis) and implementing cognitive dynamics to meet those requirements (when easy), or simulating doing that in a way that's approximately as costly as the real thing (when the real thing isn't straightforward).
Even if we end up simulating all the cognitive dynamics for each application, this would let us do some performance and scalability analysis (and we should do so while building reusable code to repeat any experiments as the simulated dynamics are replaced with realistic ones).