Goals of the Proposed Research
The proposed research is viewed as constituting the early stages of a longer-term project aimed at the creation of a human-level AGI (artificial general intelligence) system — one sharing the human mind's capability for commonsense inference and everyday general intelligence, but also going beyond human capabilities in various ways enabled by its digital computer infrastructure and highly precise algorithms. The scope of applications of human-level AGI systems is obviously tremendous, pervading all aspects of science, engineering, commerce and humanities. Among other applications, the creation of artificial scientists and engineers would vastly accelerate technological progress along multiple avenues; and, the creation of AGIs that are themselves students of AGI holds the possibility of a massive acceleration of artificial intelligence, as AGIs get smarter and smarter at making themselves smarter and smarter.
To get to these wonderful longer-term goals, however, we must proceed via incremental steps, each of which provides coherent, easily testable and demonstrable functionality of its own. This proposal describes two incremental steps that constitute significant progress along the path toward AGI with capability at human level and beyond, but also possess significant interest and value in their own right. The composite goal of these two steps (Phase 1 and Phase 2) is to create an AGI-controlled virtual agent in an online virtual world that can carry out various tasks in the world and hold simple conversations regarding its world and its interactions with the objects and agents in it. This AGI is intended to demonstrate human-like commonsense knowledge, at the rough level of a young child, within the domain of its virtual world (a world which, while lacking the richness of the real physical world, is vastly richer than the various toy domains typically used by AI researchers; and we have put considerable thought into how to create a virtual world sufficiently rich and robust to support the various world-interactions present in, for example, a preschool or a child psychological testing lab).
The goal of Phase 1 of the proposed research project is to create an artificial general intelligence (AGI) system capable of controlling a humanoid agent in a 3D simulation world, in such a way that this agent can carry out a simple English conversation about its interactions with its simulated environment. This will be done by building on the current OpenCog and RelEx software systems, guided by the OpenCogPrime software design.
A simple example conversation of the sort envisioned at the end of Phase 1 would be as follows:
Human (i.e. a human-controlled avatar): What do you see on the table?
OpenCog (i.e. the OCP-controlled avatar): A red cube, a blue ball, and a lot of small balls
Human: Which is bigger, the red cube or the blue ball?
OpenCog: The blue ball
Human: Bring me one of the small balls
OpenCog: (walks to the table, picks up a ball, and brings it to the human-controlled avatar) Here it is
Human: (takes the ball to another table, where there are three cups turned upside down, one red, one green, and one blue) I'm going to put the ball under the red cup (does so, and then shifts the cups around a bit to various different positions) Where is the ball?
OpenCog: Under the red cup
(An additional agent, Bob, who could be human- or AI-controlled, is also present)
Human: OpenCog, if I ask Bob which cup the ball is under, what will he say?
OpenCog: I don't know. He didn't see you put it under the cup.
This sort of dialogue seems extremely simple by the standards of ordinary human life, but at present there is no AI system that displays anything approaching this level of commonsensically savvy conversation. Yet all the ingredients needed to support it are present in the OpenCog and RelEx software systems the PI and his colleagues have created over the last 7 years; now it is "just" a matter of extending, fine-tuning and integrating them.
We stress that our goal is not to achieve a few specific examples of this sort of dialogue that can be showcased in a research paper or conference presentation. This has been done before, and could be done using OpenCog with far less effort than the project proposed here. Our goal is to achieve this level of virtual-world-related conversation in a truly robust and flexible way, via using an architecture that is specifically built with more ambitious goals in mind. We also note that, while Phase 1 is an important intermediary step, it is Phase 2 that will demonstrate functionality that dramatically and excitingly goes beyond anything previously obtained in the AI field. What is important in Phase 1 is not so much the behavior achieved, as the fact that the behavior achieved is being achieved using a flexible, growable general intelligence architecture that can be improved to yield the Phase 2 behaviors, and then improved far beyond that level.
Phase 2 will extend the Phase 1 behaviors significantly, by enabling the agent to carry out complex, interactive, temporally sequenced "physical" tasks in the virtual environment. This will also extend its linguistic capabilities, by providing critical additional experiential grounding for its language abilities.
A simple example of the kind of activity envisioned as possible at the end of Phase 2 is as follows:
Human: Look over there on the other side of the table. What do you see?
OpenCog: Bill and Bob are there.
Human: What are they doing?
OpenCog: Bill is throwing a red ball to Bob. Bob is throwing a red ball to Bill.
Human: Use these blocks on the table to show me what they're doing.
OpenCog: (at the table, stacks up several red blocks to make a "simulated Bob", stacks several blue blocks to make a "simulated Bill", and uses a small green ball as a "simulated ball")
Human: (pointing at the stack of blue blocks) What's that supposed to be?
OpenCog: (points to Bill)
Human: Very funny. I meant: which blocks are Bill?
OpenCog: (pointing at the stack of blue blocks) These are Bill
Human: Can you use the blocks to show me what Bill and Bob are doing?
OpenCog: (takes the green ball and moves it back and forth in the air between the Bob-blocks and the Bill-blocks, to illustrate the game of catch that Bob and Bill are playing) They're throwing the ball to each other like this.
Note the difference here from Phase 1: there is more than just cognizant conversation, locomotion and simple object manipulation going on; there is systematic, purposeful, planned manipulation of objects in an interactive social context.
Again, we stress that our goal is not to achieve a handful of examples of this sort of conversation, via carefully customizing or tuning our AI system in an example-specific way. Rather, we seek to achieve this level of functionality in a robust and flexible way. It is important that we are achieving this functionality using a general intelligence architecture created with further expansion in mind. However, further progress aside, we believe that the Phase 2 functionality will be considered sufficiently exciting by the AI community to be taken as a demonstration of dramatic achievement, and of the unprecedented promise of the OpenCogPrime approach to AGI.
The ordering of these two phases may seem peculiar as compared to human cognitive development, but this appears the most natural order in which to proceed given the specific technology bases available to drive development at present.
The specification of precise intelligence tests suitable for measurement of the system's abilities during each phase is part of the research project itself; but clear guidelines for the creation of such tests have been formulated in recent publications by the PI.
Deliverables Associated with the Proposed Research
All results, "prototypes" or systems that are developed under this proposal will be made publicly available: in the case of software via appropriate open-source licenses; and in the case of research discoveries, via publication in professional conferences or journals, or posting online as technical reports.
The primary deliverables associated with this project will be:
- A C++ software system implementing the above functionalities in an extensible way, and connected to a virtual-world system (most likely a flavor of OpenSim, such as RealXTend), so as to control a humanoid agent in that virtual world. This humanoid agent will hold simple English conversations regarding the objects and agents in its immediate surroundings; and at the end of Phase 2, will carry out complex, interactive sequenced activities in its environment.
- Detailed documentation of the software created
- Detailed records of the experiments conducted with the software in the virtual world
- A set of tests and metrics for evaluation of the intelligence of AI-controlled agents in virtual worlds. These tests will be described in written form, and in some cases will also be embodied in appropriate objects and environments existing in the virtual world.
- Publication of detailed designs, tests and results as online technical reports and journal and conference papers
The conceptual foundation of the proposed work is OpenCogPrime (OCP), a thorough, systematic software design for artificial general intelligence (AGI), initially created as an open-source version of the proprietary Novamente Cognition Engine (NCE) design. The NCE design (described e.g. in Goertzel, 2006; Goertzel et al, 2004) was developed and prototyped during the period 2001-2008, inspired by earlier work on the related Webmind AI Engine software system (Goertzel, 2002).
The OpenCogPrime design has been created and proposed together with the hypothesis that, if it is fully fleshed-out, implemented, tested, tuned and taught, it will lead to a software system with intelligence at the human level and ultimately beyond.
The completion of the implementation of OCP is a matter of research as well as software implementation: there is still significant scientific discovery and experimentation to be done in fleshing out the details of the system. Some aspects have been implemented in production-grade software, some in prototype software, and others remain unimplemented and have been fleshed out in detail only mathematically and software-design-wise. There is still real work to be done. But all components of the design have been analyzed and refined carefully over a period of years, and most importantly, the holistic nature of the design has been carefully worked out so that the various components may appropriately interact to give rise to the necessary emergent structures and dynamics of intelligence.
Put briefly, the key reasons why we believe the project has a strong chance of success, are as follows:
- OCP is based on a well-reasoned, comprehensive theory of mind, conceptually outlined in (Goertzel, 2006), which dictates a unified approach to the five key aspects that must be addressed in any AGI system:
- knowledge representation
- learning, reasoning and creativity, collectively grouped as knowledge creation
- cognitive architecture
- embodied, experiential learning
- emergent structures and dynamics
- The specific algorithms and data structures chosen to implement this theory of mind are efficient, robust and scalable, as is the OpenCog software implementation
- Virtual world technology provides a powerful arena for instructing and experimenting with AGI systems
- The open source methodology allows a considerable amount of global brainpower and software engineering effort to be brought to bear on the system. The core team to be funded by this proposal will be focused on the "hardest parts" of completing the OCP project, but will benefit from the additional efforts of a community of open-source developers, who will help with debugging the system, teaching it, and making numerous valuable software improvements
Along with the OCP design, the other key technical and scientific foundations of the project are as follows:
- OpenCog, an open-source AI software system consisting largely of software code created by the PI and his colleagues at Novamente LLC during the period 2001-2008 as part of its Novamente Cognition Engine project, and which constitutes a solidly-engineered foundation for advanced, virtually-embodied AI systems, compatible with the OpenCogPrime AGI design. Key components include the AtomSpace weighted labeled hypergraph (node and link) knowledge representation, the PLN (probabilistic logic networks) reasoning engine, the MOSES probabilistic evolutionary procedure learning engine, and the ECAN system for economics-based attention allocation.
- RelEx, an open-source NLP (natural language processing) engine, created by the PI and his colleagues and incorporating a modified version of the Carnegie-Mellon link parser along with other customized tools, and integrated with OpenCog. RelEx also contains an offshoot called NLGen, which carries out language generation (complementing language comprehension which is RelEx's main function)
- OpenSim, an open-source virtual world platform; OpenCog has a proxy allowing it to control agents in OpenSim (so far used mostly for controlling virtual pets). OpenSim is highly customizable, and part of the proposed work involves customizing it to provide greater flexibility in terms of agent motor control and object dynamics
The research proposed here would involve implementation of only a portion of the OCP design; the goal has been to choose a fraction that, in itself, will lead to some scientifically interesting and impressively demonstrable functionality, both at the end of Phase 1 and then even more so at the end of Phase 2. Demonstrability is emphasized both because of its scientific importance, and because it may be valuable to use the results of this project as a widely disseminated showcase for the power of the OCP design, so as to help achieve greater funding for the deployment of a larger team to implement the remainder of the design.
The key design aspects and high-level roadmap for the OCP project have been articulated in the initial sections of an online text called the OpenCogPrime Wikibook, available online at http://opencog.org/wiki/OpenCog_Prime. Relevant material is also found in the two books:
- The Hidden Pattern (Goertzel, 2006), which stresses conceptual and philosophical foundations
- Probabilistic Logic Networks (Goertzel et al, 2008), which presents a probabilistic logic system (PLN) that has been integrated into OpenCog and serves a key role in OpenCogPrime
Key Design Aspects of OpenCogPrime
Figure 4 gives a high-level overview of the key cognitive dynamics in the OpenCogPrime design, most of which are currently implemented in prototype form, and some of which are implemented and tested more thoroughly.
Figure 4. High-level overview of the cognitive dynamics in the OpenCogPrime design. For detailed explanations the reader is directed to the references. The overall operation of the system involves the pursuit of a set of goals provided by the programmer, which are then refined by inference. At each time the system chooses a set of procedures to execute, based on its judgments regarding which procedures will best help it achieve its goals in the current context. These procedures may involve external actions (e.g. involving conversation, or controlling an agent in a simulated world) and/or internal cognitive actions. In order to make these judgments it must effectively manage declarative, procedural, episodic, sensory and attentional memory, each of which is associated with specific algorithms and structures as depicted in the diagram. There are also global processes spanning all the forms of memory, including the allocation of attention to different memory items and cognitive processes, and the identification and reification of system-wide activity patterns. The RelEx language engine is currently separate from the main OpenCog framework, feeding the latter with input based on its language processing algorithms; but Phase 1 of this proposal involves porting RelEx into OpenCog's PLN-based declarative memory management component.
As may be seen in Figure 4, the key cognitive algorithms of OCP are:
- Probabilistic Logic Networks (PLN), a logical inference framework capable of uncertain reasoning about abstract knowledge, everyday commonsense knowledge, and low-level perceptual and motor knowledge (Goertzel et al, 2008)
- MOSES, a probabilistic evolutionary learning algorithm, which learns procedures (represented as LISP-like program trees) based on specifications (Looks, 2006)
- Economic Attention Networks (ECAN), a framework for allocating (memory and processor) attention among items of knowledge and cognitive processes, utilizing a synthesis of ideas from neural networks and artificial economics
- Map formation, the process of scanning the knowledge base of the system for patterns and then embodying these patterns explicitly as new knowledge items
- Concept creation, the process of forming new concepts via blending and otherwise merging existing ones
- Simulation, the running of simulations of (remembered or imagined) external-world scenarios in an internal world-simulation engine
- Goal refinement, that transforms given goals into sets of subgoals
and there are also other supporting cognitive algorithms. Each of these cognitive algorithms deals with one or more types of memory: declarative, procedural, sensory, episodic or attentional. Declarative and attentional memory are handled in a structure called the AtomTable, which is a special form of weighted, labeled hypergraph (i.e. a table of nodes and links with different types and with multiple weights giving probabilistic truth values, as well as space and time resource related attentional information). Procedural memory is handled using special "Combo" tree structures embodying LISP-like programs, in a special program dialect intended to manage behaviors in a virtual world and actions in the AtomTable. Sensory memory is handled via specialized sense-modality-specific data structures, and episodic memory is dealt with via an internal simulation world that allows the system to run "mind's eye movies" of situations it remembers, has heard about, or hypothetically envisions.
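As a concrete (and purely illustrative) sketch, the AtomTable's weighted, labeled hypergraph can be modeled in a few lines of Python; the class and field names below are hypothetical simplifications for exposition, not the actual OpenCog C++ API:

```python
from dataclasses import dataclass, field

@dataclass
class TruthValue:
    strength: float      # probability estimate
    confidence: float    # weight of evidence behind that estimate

@dataclass
class AttentionValue:
    sti: float = 0.0     # ShortTermImportance (illustrative)
    lti: float = 0.0     # LongTermImportance (illustrative)

@dataclass
class Atom:
    atom_type: str                   # e.g. "ConceptNode", "InheritanceLink"
    name: str = ""
    outgoing: tuple = ()             # target atoms, for n-ary links
    tv: TruthValue = field(default_factory=lambda: TruthValue(0.0, 0.0))
    av: AttentionValue = field(default_factory=AttentionValue)

class AtomTable:
    """A toy weighted, labeled hypergraph: nodes plus n-ary links."""
    def __init__(self):
        self.atoms = []

    def add(self, atom):
        self.atoms.append(atom)
        return atom

# Example: representing "balls are toys" with a probabilistic truth value
table = AtomTable()
ball = table.add(Atom("ConceptNode", "ball"))
toy = table.add(Atom("ConceptNode", "toy"))
link = table.add(Atom("InheritanceLink", outgoing=(ball, toy),
                      tv=TruthValue(strength=0.9, confidence=0.7)))
```

The point of the sketch is simply that links are themselves atoms with truth and attention values, so the same store serves both declarative and attentional memory.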
The overall coordinated activity of these algorithms, in the service of OpenCogPrime's general intelligence and specifically its conversational and virtual-world activity, may be understood via the "cognitive schematic"

Context & Procedure → Goal ⟨p⟩

interpreted to mean "If the context C appears to hold currently, then if I enact the procedure P, I can expect to achieve the goal G with certainty p." The system is initially supplied with a set of goals such as "get rewarded by my teacher", "learn new things" and so forth; and it then uses PLN inference (guided by other cognitive mechanisms) to refine these initial goals into more specialized subgoals. We use the term "cybernetic knowledge" to refer to the system's knowledge of its goals and subgoals.
A procedure in this schematic is a Combo tree stored in the system's procedural knowledge base; and a context is a (fuzzy, probabilistic) logical predicate stored in the AtomTable, which holds to a certain extent during each interval of time. A goal is likewise a fuzzy logical predicate with a certain value at each interval of time. In what follows we will also use the shorthand C & P → G for this schematic.
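A toy illustration of how the cognitive schematic might drive procedure selection (all names and probability values below are invented for illustration; this is not the actual OCP control code):

```python
from dataclasses import dataclass

@dataclass
class Schematic:
    """One instance of Context & Procedure -> Goal <p>."""
    context: str
    procedure: str
    goal: str
    p: float          # estimated probability that P achieves G in C

# A tiny, hypothetical store of cognitive schematics
knowledge = [
    Schematic("teacher_asks_fetch", "walk_to_table_and_grab", "please_teacher", 0.8),
    Schematic("teacher_asks_fetch", "wander_randomly",        "please_teacher", 0.1),
    Schematic("ball_hidden",        "track_cup_movements",    "answer_correctly", 0.7),
]

def choose_procedure(current_context, active_goal):
    """Synthesis step: fix C and G, and pick the P whose schematic
    carries the highest estimated probability p."""
    candidates = [s for s in knowledge
                  if s.context == current_context and s.goal == active_goal]
    return max(candidates, key=lambda s: s.p).procedure if candidates else None

choice = choose_procedure("teacher_asks_fetch", "please_teacher")
```

In the real system the candidate procedures are Combo trees, the contexts and goals are fuzzy predicates in the AtomTable, and the probabilities come from the learning processes described next rather than from a fixed table.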
This formalization leads to a conceptualization of the internal action of an OpenCogPrime system as involving two "key learning processes":
- Estimating the probability p of a posited C & P → G relationship
- Filling in one or two of the variables in the cognitive schematic, given assumptions regarding the remaining variables, and directed by the goal of maximizing the probability of the cognitive schematic
... or, to put it less technically:
- Evaluating conjectured relationships between procedures, contexts and goals ("analysis")
- Conceiving novel possible relationships between procedures, contexts and goals ("synthesis")
Given this conceptualization, we can see that, where synthesis is concerned,
- Procedural knowledge, and procedural learning methods like MOSES, can be useful for choosing P, given fixed C and G. Simulation may also be useful, via creating a simulation embodying C and seeing which P lead to the simulated achievement of G
- Declarative knowledge, and associated methods like PLN, can be useful for choosing C, given fixed P and G (also incorporating sensory and episodic knowledge as useful). Simulation may also be used for this purpose.
- PLN, acting on declarative knowledge, can be useful for choosing G, given fixed P and C. Simulation may also be used for this purpose.
- Goal refinement is used to create new subgoals G to sit on the right hand side of instances of the cognitive schematic
- Concept formation and map formation are useful for choosing G and for fueling goal refinement, but especially for choosing C (via providing new candidates for C). They can also be useful for choosing P, via a process called "predicate schematization" that turns logical predicates (declarative knowledge) into procedures.
On the other hand, where analysis is concerned:
- PLN, acting on declarative knowledge, can be useful for estimating the probability of the implication in the schematic equation, given fixed C, P and G. Episodic knowledge can also be useful in this regard, via enabling estimation of the probability via simple similarity matching against past experience. Simulation may also be used: multiple simulations may be run, and statistics may be captured therefrom.
- Procedural knowledge, mapped into declarative knowledge and reasoned on by PLN, can be useful for estimating the probability of the implication C & P → G, in cases where the probability of C & P1 → G is known for some P1 related to P
- PLN, acting on declarative or sensory knowledge, can be useful for estimating the probability of the implication C & P → G, in cases where the probability of C1 & P → G is known for some C1 related to C; and similarly for estimating the probability of the implication C & P → G, in cases where the probability of C & P → G1 is known for some G1 related to G
- Map formation and concept creation can be useful indirectly in calculating these probability estimates, via providing new concepts that can be used to make useful inference trails more compact and hence easier to construct
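The episodic route to probability estimation mentioned above, namely simple similarity matching against past experience, can be sketched as follows (a deliberately simplified stand-in using Jaccard feature overlap; this is not the PLN or OCP episodic-memory machinery):

```python
# Each remembered episode records a feature set describing its context
# and procedure, plus whether the goal was achieved. (Illustrative only.)

def jaccard(a, b):
    """Feature-overlap similarity between two episodes."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

episodes = [
    ({"table", "red_cube", "fetch"}, True),
    ({"table", "blue_ball", "fetch"}, True),
    ({"floor", "wander"}, False),
]

def estimate_success(query_features):
    """Similarity-weighted estimate of P(G | C & P): each past episode
    votes for success or failure in proportion to its similarity."""
    weights = [(jaccard(query_features, feats), success)
               for feats, success in episodes]
    total = sum(w for w, _ in weights)
    if total == 0:
        return 0.5  # no relevant evidence: fall back to an uninformative prior
    return sum(w for w, ok in weights if ok) / total

p = estimate_success({"table", "fetch", "green_ball"})
```

The value of such estimates is that they are cheap: a rough episodic guess can prune the space of schematics before the more expensive inferential machinery is invoked.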
The following tables sum up the roles of the different OCP cognitive mechanisms, first in terms of synthesis versus analysis, and then as applied to simpler versus more difficult problems:

| Memory type | Synthesis | Analysis |
|---|---|---|
| Declarative & Procedural (concept creation) | Evolutionary, blending-based and logical concept creation | Evolutionary, blending-based and logical concept creation |
| Declarative & Procedural (inference) | PLN forward inference | PLN backward inference |
| Procedural | MOSES and hillclimbing procedure learning (combining portions and features of prior procedures) | Probabilistic modeling to identify patterns among programs fulfilling a certain goal in a certain context (part of MOSES) |
| Episodic | Imagination of hypothetical episodes based on specified criteria, via combination of aspects of known episodes | Filling in gaps in remembered or hypothesized episodes |
| Attentional | Hebbian learning | Assignment of credit |
| Intentional | Goal synthesis | Goal refinement |

| Memory type | Simpler problems | More difficult problems |
|---|---|---|
| Declarative & Procedural (concept creation) | Logical concept creation | Evolutionary and blending-based concept creation |
| Declarative & Procedural (inference) | PLN forward and backward inference | PLN forward and backward inference |
| Procedural | MOSES procedure learning and pattern mining | MOSES procedure learning and pattern mining |
| Episodic | Filling in small gaps in remembered or hypothesized scenes or episodes; imagination of hypothetical scenes or episodes highly similar to prior ones | Filling in large gaps in remembered or hypothesized scenes or episodes; imagination of hypothetical scenes or episodes very different from prior ones |
| Attentional | Assignment of credit, in simple cases | Assignment of credit |
| Intentional | Goal synthesis & refinement, in simple cases | Goal synthesis & refinement |
We can also see the key role of interaction between different cognitive mechanisms here, because sometimes the best way to handle the schematic equation will be to fix only one of the terms. For instance, if we fix G, sometimes the best approach will be to collectively learn C and P. This requires either a procedure learning method that works interactively with a declarative-knowledge-focused concept learning or reasoning method; or a declarative learning method that works interactively with a procedure learning method.
Interactions like this are actually key to the theory underlying OpenCogPrime. One of the key ideas underlying the system is that most AI algorithms suffer from combinatorial explosions: the number of possible elements to be combined in a synthesis or analysis is just too great, and the algorithms are unable to filter through all the possibilities, given the lack of intrinsic constraint that comes along with a "general intelligence" context (as opposed to a narrow-AI problem like chess playing, where the context is constrained and hence restricts the scope of possible combinations that needs to be considered). In the OpenCogPrime design, the different learning mechanisms are supposed to interact in such a way as to palliate each other's combinatorial explosions: each learning mechanism dealing with a certain sort of knowledge must synergize with the learning mechanisms dealing with the other sorts of knowledge, in a way that decreases the severity of combinatorial explosion. The Appendix to this proposal gives a table describing the interactions between the different learning mechanisms in the OpenCogPrime design, which may elucidate this notion a little further.
The final fact the OCP design needs to account for is that, in any real-world context, a system will be presented with a huge number of possibly relevant analysis and synthesis problems. Choosing which ones to explore is a difficult cognitive problem in itself — a problem that also takes the form of the cognitive schematic, but where the procedures are internal rather than external. Thus this problem may be addressed via the analysis and synthesis methods described above. This is the role of attentional knowledge, which is handled by the ECAN artificial economics mechanism, which continually updates ShortTermImportance and LongTermImportance values associated with each item in the system's memory; these values control the amount of attention other cognitive mechanisms pay to the item, and how much motive the system has to keep the item in memory. ECAN has deep interactions with other cognitive mechanisms as well, which are essential to its efficient operation; for instance, PLN inference may be used to help ECAN extrapolate conclusions about what is worth paying attention to, and MOSES may be used to recognize subtle attentional patterns. ECAN also handles "assignment of credit", the figuring-out of the causes of an instance of successful goal-achievement, drawing on PLN and MOSES as needed when the causal inference involved here becomes difficult.
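The economic metaphor behind ECAN can be conveyed with a toy rent-and-wages loop (the parameter values and update rules here are invented for illustration; the real ECAN equations are considerably more sophisticated):

```python
# Toy economics-based attention allocation in the spirit of ECAN:
# every item pays "rent" each cycle, items that prove useful earn
# "wages", and items whose long-term importance is exhausted are
# forgotten. All constants below are arbitrary illustrative choices.

RENT = 1.0   # per-cycle cost of occupying attention
WAGE = 5.0   # reward for contributing to goal achievement

class Item:
    def __init__(self, name, sti=10.0, lti=10.0):
        self.name, self.sti, self.lti = name, sti, lti

def cycle(items, used_names):
    """One attention-allocation cycle over the memory store."""
    for it in items:
        it.sti -= RENT                    # everyone pays rent
        if it.name in used_names:
            it.sti += WAGE                # useful items earn wages
            it.lti += 1.0
        else:
            it.lti -= 0.5
    # forget items whose long-term importance has run out
    return [it for it in items if it.lti > 0]

items = [Item("red_cup"), Item("old_fact", lti=1.0)]
for _ in range(3):
    items = cycle(items, used_names={"red_cup"})
```

After a few cycles the frequently used item has accumulated importance while the unused one has been forgotten, which is the qualitative behavior the economic mechanism is meant to produce.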
Experimentation and Evaluation Environment
Another very important aspect of the practical development of OCP is the question of "evaluation and metrics" — or in less formal terms: AGI IQ testing. How do we measure incremental progress, as we move toward our goals?
This is a complex and subtle issue, and we have crafted an approach which we call the "AGI Preschool Test", based on the theory of multiple intelligences and current practices in human early childhood education. These ideas on testing are articulated in a recent paper by the PI entitled "AGI Preschool," to be presented at the AGI-09 conference in Washington DC in March 2009. In essence, what is proposed there is the creation, within a multiuser online virtual world such as OpenSim, of an environment similar to the preschools used in early human childhood education. What is required for the present project is the creation of a testing environment based on a subset of the ideas in that paper.
An AGI Preschool, in general, consists of a set of "learning centers" where young AGI systems can interact with objects in the virtual world, one another, and humanoid agents. Typical learning centers focus on reading, writing, science, math, manipulatives and dramatics. The currently proposed work would require only a manipulatives learning center, consisting of blocks, balls and other objects situated on a few tables, which AGIs and human-controlled agents can gather around. Depending on how Phase 2 progresses, we might also introduce science and/or dramatics learning centers at that stage.
The basic philosophy underlying this approach to development and evaluation is that tasks with an everyday-human-world flavor are more appropriate than abstracted mathematical, puzzle-like or data-analysis-type tests or tasks, for assessing and working toward human-like general intelligence. We believe that real-world tasks have a subtlety of interconnectedness and developmental course that is not captured in current mathematical learning frameworks nor standard AI test problems. To put it mathematically, we suggest that the universe of real-world human tasks has a host of "special statistical properties" that have implications regarding what sorts of AI programs will be most suitable; and that, while exploring and formalizing the nature of these statistical properties is important, an easier and more reliable approach to AGI testing is to create a testing environment that embodies these properties implicitly, via its being an emulation of the cognitively meaningful aspects of the real-world human learning environment. Thus we feel the "partial AGI Preschool" approach proposed here is the most appropriate one.
However, we don't consider it necessary to posit specific quantitative evaluation metrics for the system, within the proposed test environment. To do so for the target functionality (intelligent conversation about the contents of the virtual world) would be awkward and artificial, and would also tend to encourage "overfitting" of the system to the specific quantitative test criteria. Rather, we believe the best approach is to create a robust test and evaluation environment, with rich instrumentation for logging and manipulating the system and its environment, and then perform careful qualitative evaluation of the system's performance at each stage.
Brief Statement of Work
Phase 1 will result in an AI system that controls a humanoid agent living in a customized version of the OpenSim virtual world, and carries out simple English conversations about the objects in the world around it, and its interactions therewith. The agent will interact with objects via grasping, pointing, carrying and executing other simple actions, as well as communicating about what it sees and what it and other agents are doing.
The specific work proposed in Phase 1, toward this goal, falls into the following categories:
- Language processing: full integration of the RelEx and NLGen language comprehension and generation algorithms into the OpenCog framework, thus allowing sophisticated cognitive guidance of linguistic processing, and enabling more intelligent language functionality than permitted within RelEx and NLGen currently
- Inference: completion of the implementation of the PLN (Probabilistic Logic Networks) inference framework within OpenCog, and customization and tuning of PLN for inference on the output of RelEx
- Attention allocation: completion and implementation of the ECAN economic attention allocation framework within OpenCog
- Dialogue management: implementation within OpenCog of a controller process for managing conversational dialogue, drawing on linguistic and inferential functionality
- Scalability: improvement of the OpenCog infrastructure to support distributed processing, and real-time responsive scheduling
- Virtual embodiment: extension of OpenCog's current virtual embodiment framework, which has been customized for simplistic virtual pets, to handle more flexibly controllable virtual humanoid agents, via integrating an open-source robot simulator with the OpenSim virtual world
Phase 2 will extend the Phase 1 system to create an embodied agent that carries out complex interactive tasks in the virtual world, communicating about what it does, but also doing systematic planning and reasoning about its actions and communications. The specific technical work proposed in Phase 2 falls into the following categories:
- Language processing: deep integration of PLN inference into the OpenCog language processing framework, allowing parse selection, word sense disambiguation and reference resolution to be carried out in a contextually sensitive way using inferential judgment
- Inference: implementation, testing and tuning of spatiotemporal inference using PLN, allowing reasoning about complex sequences of actions (as coordinated with linguistic knowledge)
- Procedure learning: extension of the procedure learning code to allow for the learning of more complex embodied procedures
- Attention allocation: integration of attention allocation with action selection and procedure execution, allowing the real-time execution of complex, learned embodied procedures
- Concept formation: implementation of heuristics for the formation of novel concepts, based on combining prior concepts and on recognizing patterns in the current network of knowledge
- Integrative Intelligence: tuning of the above cognitive procedures to work together effectively, enhancing each other's intelligence
- Dialogue management: adaptation of conversational patterns using procedure learning based on MOSES, PLN and attention allocation
- Scalability: addition of specialized distributed-processing techniques allowing rapid distributed PLN inference and MOSES procedure learning
- Virtual embodiment: implementation of "bead physics" in the OpenSim virtual world, allowing the creation of complex textures, masses, strings and other entities alongside the typical virtual-world objects; implementation of a more flexible body for the AI using bead physics as well
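To clarify what "economic" attention allocation means in the ECAN bullets above, here is a deliberately simplified sketch. The function name, data layout and update rule are our illustrative assumptions, not OpenCog's actual API: each Atom holds a quantity of short-term importance (STI), which diffuses along weighted HebbianLinks while the system's total STI is conserved, like currency changing hands.

```python
# Hedged sketch of economic importance spreading in the spirit of ECAN
# (names and update rule are illustrative, not OpenCog's actual API).

def spread_importance(sti, links, rate=0.2):
    """sti: {atom: STI value}; links: {(src, dst): weight in [0, 1]}.
    Each atom sends `rate` of its STI out, split over its outgoing
    links in proportion to their weights. Returns the new STI map."""
    new_sti = dict(sti)
    for src in sti:
        out = {dst: w for (a, dst), w in links.items() if a == src}
        total_w = sum(out.values())
        if total_w == 0:               # no outgoing links: keep all STI
            continue
        budget = sti[src] * rate       # amount of STI this atom pays out
        new_sti[src] -= budget
        for dst, w in out.items():
            new_sti[dst] += budget * w / total_w
    return new_sti

sti = {"cat": 100.0, "mammal": 10.0, "furry": 0.0}
links = {("cat", "mammal"): 0.8, ("mammal", "furry"): 0.5}
sti2 = spread_importance(sti, links)
# Total STI is conserved across the update.
```

The conservation constraint is the key design point: because importance is a fixed-size currency rather than an unbounded activation level, spreading cannot blow up, and Atoms must effectively compete for attention.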
The proposed team members have been working together on related projects for a substantial period of time already (6 of the team members have worked together since 1998, and in some cases the collaborative relationships go back even further). Bios of team members are given at the end of this proposal.
The spreadsheet OpenCogProposalWorkBreakdown.xls associated with this document presents a rough work plan, explaining what the specific team members will be working on during the course of the project. Each of the two phases of the project is broken down into three iterations. The Phase 1 work breakdown is given at a fair level of detail; the Phase 2 breakdown is given at a higher level of abstraction, but further details can be supplied upon request.
We present here development milestones for each of the 6 month iterations within each of the two proposed phases. These are important so that external observers can assess the progress that has been made, and also so that the development team has concrete goals to work towards.
|Phase 1, Iteration 1||Simple conversations, involving declarations, questions and answers regarding relative locations of objects and timing of events, at the manipulatives center|
|Phase 1, Iteration 2||Simple conversations, involving declarations, questions and answers regarding the states of knowledge that self and others have about locations of objects and timing of events, at the manipulatives center|
|Phase 1, Iteration 3||More flexible conversations (about relative locations of objects and timing of events, and agents' states of knowledge thereof), involving multiple rhetorical modes such as: explanation, argumentation, commanding, obeying, answering, chatting. (This is exemplified by the Phase 1 example dialogue given above.)|
|Phase 2, Iteration 1||Conversations centering on following instructions regarding manipulations to be carried out involving objects and/or agents; including asking for clarification regarding instructions not initially well understood|
|Phase 2, Iteration 2||Conversations in which a human gives the AI general goals or vague instructions regarding what manipulations to carry out involving objects, and the AI has to figure out the details interactively|
|Phase 2, Iteration 3||Conversations in which a human gives the AI general goals or vague instructions regarding what manipulations to carry out involving objects and other agents, and the AI has to figure out the details interactively (this is exemplified by the Phase 2 example dialogue given above).|
Comparison with Other Ongoing Research
A moderate number of attempts to create human-level AGI have been made in the past and met with disappointing results; the reader will thus be forgiven a certain level of skepticism regarding the prospects of the present proposal.
However, there are many reasons to believe things can be different this time around. From the present point of view, the ideas and designs underlying those earlier attempts embody considerable conceptual naivety, and were to some extent overadapted to the limited supporting sciences and technologies of their time; for this reason, their failure tells us very little about the potential success of modern approaches to AGI.
While explicit progress toward AGI has been slow so far, recent decades have seen dramatic progress in allied supporting areas, including computing hardware, software technology, computer science and cognitive science. Many of the algorithms and data structures that we are using within OCP simply could not have been used in practice 10 years ago, due to limitations in RAM and processor power. Advances in cognitive science during the last two decades have given us deep additional insights into numerous areas related to AGI, such as language learning, creativity, and the cognitive role of embodiment. Computer science and AI advances in areas such as automated program learning and Bayesian inference have also been highly significant, and have played a large role in the development of the particular algorithms used in OCP.
In short, within the OCP design we are leveraging recent developments in a number of areas to do things that could never have been done before. These advances in allied areas, in themselves, do not add up to a workable approach to AGI — but, we suggest, they do lead to the possibility of creating a workable AGI design, if one begins with an appropriate overall cognitive theory and then works out its practical AGI implications in a manner appropriately concordant with recent technologies and discoveries.
Given the way the current science and technology scene is primed for AGI progress, it may seem surprising that there is currently no existing project, in academia, industry or the nonprofit R&D sector, in which a team of qualified and experienced AI professionals is directly working on engineering human-level AI in the near term. Yet such is the case. There are many reasons for this, but a central one is that, prior to OpenCogPrime, no comprehensive, detailed software design for human-level AI had been publicly proposed (with one partial exception, Cyc, which will be discussed below).
One may ask why no such design has been proposed before OpenCogPrime. Again, there are many reasons, but we suspect a central one is sociological: due to the relative unpopularity of research on human-level AI with contemporary funding institutions, few AI researchers have devoted substantial time to the very difficult task of creating such designs, choosing to focus instead on other sorts of AI research for which funding and incremental positive feedback are far more readily available. For the last two decades at least, the vast bulk of AI funding has gone into "narrow AI," meaning AI projects that address highly domain-specific or task-specific problems rather than general-purpose intelligence. A more thorough discussion of this point, and of other related issues concerning the sociology of the AI field, can be found in the Introduction to the volume Artificial General Intelligence (Goertzel and Pennachin, Editors, Springer 2006).
There are, in the academic domain, a number of laboratories engaged in more limited-scope research projects directly concerned with human-level AI. An incomplete list would include the classic SOAR and ACT-R projects, Pei Wang's NARS system, Nick Cassimatis's PolyScheme, IBM's Joshua Blue, and Stan Franklin's LIDA, all reviewed in the Artificial General Intelligence volume mentioned above. The PI is acquainted with the leaders of these labs (and others), and plans to invite several of them to serve as Scientific Advisors of the proposed project. While these projects are highly valuable and embody serious insights, none of them are specifically aimed at the near-term engineering of human-level AGI; and it is the PI's assessment that none of them is based on a truly comprehensive design or theory of human-level AGI. However, there is certainly much room for conceptual cross-pollination between these projects (and others not listed here) and the proposed work.
In recent years there have been occasional efforts by US government funding agencies to support human-level-AI related work. However, for various practical reasons, these have not succeeded in directing significant funds to research teams addressing the problem. Most notable was the DARPA project for Biologically-Inspired Cognitive Architectures (BICA), which was initiated in 2003 as an ambitious multi-year initiative, but was cancelled 18 months later, after funding a number of teams to carry out the initial parts of their projects. The majority of these projects then had to be abandoned due to lack of funding.
Perhaps the closest thing to a competing current project aimed directly at human-level AGI is the Cyc project, which has been in operation since the mid-1980s and is led by noted AI researcher Doug Lenat. Cyc does have a comprehensive architecture aimed at human-level AGI, but it seems fair to say that by this point, the assumptions underlying that architecture have been thoroughly examined and discredited by the vast majority of the AI community. In essence, Cyc consists of a logic engine attached to a large, hand-created knowledge base. In the view of nearly all AI researchers, neuroscientists and cognitive scientists currently practicing, this is plainly an inadequate approach to achieving human-level AI; and the large amount of money and effort expended on Cyc over the last decades would seem to support this skeptical judgment. The belief underlying the Cyc project, however, seems to be that once a sufficient amount of knowledge is entered into the knowledge base, the system will achieve human-level AI despite the relative naivety of the architecture. In our view, the smaller-scale academic human-level AI projects listed above all reflect a far more realistic and nuanced understanding of the issues that must be addressed to create a human-level AI system.
There is also a handful of low-profile, secretive commercial projects aimed at human-level AI. Two examples are A2I2 (led by Peter Voss) and Self-Aware Systems (led by Stephen Omohundro). The PI has made a serious effort to keep abreast of such developments insofar as is possible, and his current impression (based on the limited information available) is that these secretive efforts are currently less advanced toward human-level AI than several of the smaller-scale academic projects listed in the prior paragraphs.
In short, the proposed project, once funded, will constitute by far the planet's most serious concerted attack on the human-level AGI problem.
An incomplete sampling of publications related to OpenCogPrime and (mainly) the related Novamente Cognition Engine is as follows:
- Goertzel, Ben, Matthew Ikle', Izabela Freire Goertzel and Ari Heljakka (2008). Probabilistic Logic Networks. Springer Verlag
- Goertzel, Ben (2008a). OpenCog Prime: Design for a Thinking Machine. Online wikibook, at http://opencog.org/wiki/OpenCogPrime
- Goertzel, Ben (2006). The Hidden Pattern: A Patternist Philosophy of Mind, Brown-Walker Press
- Goertzel, Ben (2002). Creating Internet Intelligence. Plenum Press
- Goertzel, Ben. A Pragmatic Path Toward Endowing Virtually-Embodied AIs with Human-Level Linguistic Capability, Special Session on Human-Level Intelligence, IEEE World Congress on Computational Intelligence (WCCI), Hong Kong, 2008
- Goertzel, Ben and Cassio Pennachin. An Inferential Dynamics Approach to Personality and Emotion Driven Behavior Determination for Virtual Animals. The Reign of Catz and Dogz Symposium, AI and the Simulation of Behavior (AISB), Edinburgh, 2008
- Goertzel, Ben, Cassio Pennachin, Nil Geissweiller, Moshe Looks, Andre Senna, Ari Heljakka, Welter Silva and Carlos Lopes. An Integrative Methodology for Teaching Embodied Non-Linguistic Agents, Applied to Virtual Animals in Second Life, in Proceedings of the First AGI Conference, Ed. Wang et al, IOS Press
- Goertzel, Ben and Stephan Vladimir Bugaj. Stages of Ethical Development in Artificial General Intelligence Systems, in Proceedings of the First AGI Conference, Ed. Wang et al, IOS Press
- Ikle', Matthew and Ben Goertzel. Probabilistic Quantifier Logic for General Intelligence: An Indefinite Probabilities Approach, in Proceedings of the First AGI Conference, Ed. Wang et al, IOS Press
- Hart, David and Ben Goertzel. OpenCog: A Software Framework for Integrative Artificial General Intelligence, in Proceedings of the First AGI Conference, Ed. Wang et al, IOS Press
- Ikle', Matt and Ben Goertzel. Indefinite Probabilities for General Intelligence, in Advances in Artificial General Intelligence, IOS Press.
- Goertzel, Ben. Virtual Easter Egg Hunting: A Thought-Experiment in Embodied Social Learning, Cognitive Process Integration, and the Dynamic Emergence of the Self, in Advances in Artificial General Intelligence, IOS Press.
- Heljakka, Ari, Ben Goertzel, Welter Silva, Izabela Goertzel and Cassio Pennachin. Reinforcement Learning of Simple Behaviors in a Simulation World Using Probabilistic Logic, in Advances in Artificial General Intelligence, IOS Press.
- Goertzel, Ben and Stephan Bugaj (2006). Stages of Cognitive Development in Uncertain-Logic-Based AI Systems, in Advances in Artificial General Intelligence, IOS Press.
- Goertzel, Ben, Ari Heljakka, Cassio Pennachin, et al. Probabilistic Logic Based Reinforcement Learning of Simple Embodied Behaviors in a 3D Simulation World, Proceedings of International Symposium on Intelligence Computation and Applications (ISICA) 2007
- Goertzel, Ben, and Matthew Ikle'. Assessing the Weight of Evidence Implicit in an Indefinite Probability. Proceedings of International Symposium on Intelligence Computation and Applications (ISICA) 2007
- Looks, Moshe and Ben Goertzel (2006). Mixing Cognitive Science Concepts with Computer Science Algorithms and Data Structures: An Integrative Approach to Strong AI, AAAI Spring Symposium, Cognitive Science Principles Meet AI-Hard Problems, San Francisco 2006
- Goertzel, Ben, Moshe Looks, Ari Heljakka, and Cassio Pennachin (2006). Toward a Pragmatic Understanding of the Cognitive Underpinnings of Symbol Grounding, in Semiotics and Intelligent Systems Development, Ricardo Gudwin and João Queiroz, Eds., 2006
- Goertzel, Ben, Hugo Pinto, Ari Heljakka, Michael Ross, Izabela Goertzel, Cassio Pennachin. Using Dependency Parsing and Probabilistic Inference to Extract Gene/Protein Interactions Implicit in the Combination of Multiple Biomedical Research Abstracts, Proceedings of BioNLP-2006 Workshop at ACL-2006, New York
- Goertzel, Ben, Ari Heljakka, Stephan Vladimir Bugaj, Cassio Pennachin and Moshe Looks. Exploring Android Developmental Psychology in a Simulation World, Symposium "Toward Social Mechanisms of Android Science", Proceedings of ICCS/CogSci 2006, Vancouver
- Looks, Moshe, Ben Goertzel, and Cassio Pennachin, Learning Computer Programs with the Bayesian Optimization Algorithm, Genetic and Evolutionary Computation Conference (GECCO), 2005.
- Goertzel, Ben, Cassio Pennachin, Andre Senna, Thiago Maia and Guilherme Lamacie (2003). Novamente: An Integrative Architecture for Artificial General Intelligence. Proceedings of IJCAI-03 Workshop on Agents and Cognitive Modeling, Acapulco, August 2003
Appendix: Key Interactions Between OpenCogPrime Learning Mechanisms
Expanding the Technical Background section above, this Appendix gives a systematic enumeration of key interactions between OCP cognitive processes. Some of these interactions have been demonstrated empirically to date, and some have not, due to the incomplete status of the OCP implementation. According to the theory underlying OCP, these interactions are what will allow the system to carry out generally intelligent conversations in a virtual world without succumbing to the combinatorial explosions that plague traditional AI approaches.
| How →
|Map Formation||Goal System||Simulation||Sensorimotor pattern recognition|
|PLN||Creates new concepts and predicates, enabling briefer useful inference trails||Goal refinement enables more careful goal-based inference pruning||Simulations provide a method of testing speculative inferential conclusions, and suggest hypotheses to be explored via PLN||Creates new concepts and predicates, enabling briefer useful inference trails|
|MOSES||Creates new procedures to be used as modules in candidate programs||Goal refinement allows more precise definition of fitness functions, making MOSES’s job easier||Simulation provides a method of “fitness estimation” allowing inexpensive testing of candidate programs||Extraction of sensorimotor patterns allows creation of abstracted fitness functions for (inferentially and simulatively) evaluating MOSES programs guiding real-world actions|
|ECAN||Creates nodes grouping “attentionally related” Atoms, enabling ECAN to find subtler attentional patterns involving these nodes||Goal refinement allows more useful spreading of importance within ECAN||Simulation provides data for ECAN -- allowing HebbianLinks to be extracted from co-occurrences observed in simulation||Creates nodes grouping “attentionally related” Atoms, enabling ECAN to find subtler attentional patterns involving these nodes|
|Concept Creation||Creates new concepts to be fed into other concept creation mechanisms||Goal refinement provides more precise definition of criteria via which new concepts are created||Utility of concepts may be assessed via creating simulated entities embodying the new concepts and seeing what they lead to in simulation||Creates new concepts to be fed into other concept creation mechanisms|
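The "HebbianLinks extracted from co-occurrences" entry above can be illustrated with a small sketch. The formula and names below are our illustrative assumptions rather than OpenCog's actual mining code: atoms that are repeatedly in the attentional focus together get linked, with a weight approximating the conditional probability of co-attention.

```python
# Illustrative sketch (assumed formula, not OpenCog's) of mining
# HebbianLink weights from attention co-occurrence data.

from collections import Counter
from itertools import combinations

def extract_hebbian_links(attention_snapshots, min_weight=0.5):
    """attention_snapshots: list of sets of atoms that were simultaneously
    in the attentional focus. Returns {(src, dst): weight}, where weight
    approximates P(dst attended | src attended), for pairs whose weight
    reaches min_weight."""
    occur = Counter()      # how often each atom was in the focus
    co_occur = Counter()   # how often each unordered pair co-occurred
    for focus in attention_snapshots:
        occur.update(focus)
        co_occur.update(combinations(sorted(focus), 2))
    links = {}
    for (a, b), n_ab in co_occur.items():
        for src, dst in ((a, b), (b, a)):
            w = n_ab / occur[src]      # conditional co-attention frequency
            if w >= min_weight:
                links[(src, dst)] = w
    return links

snapshots = [{"ball", "red"}, {"ball", "red", "table"}, {"ball", "table"}]
links = extract_hebbian_links(snapshots)
```

Note the asymmetry of the resulting links: "red" almost always co-occurs with "ball" in these snapshots, but not vice versa, which is exactly the kind of directed associative structure importance spreading can exploit.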
| How →
|PLN||MOSES||ECAN||Concept Creation|
|PLN||NA||When PLN gets stuck in an inference trail, it can ask MOSES to learn new patterns regarding concepts in the inference trail (if there is adequate data regarding the concepts)||Importance levels allow pruning of PLN inference trees||Provides new concepts, allowing briefer useful inference trails|
|MOSES||Guides approximate procedure normalization; guides probabilistic modeling of the population of candidate programs; allows inferential fitness estimation||NA||Importance levels may be used to bias the fitness evaluation and representation-building phases of MOSES||Provides new concepts, allowing more compact programs using new concepts within conditionals|
|ECAN||Enables inference of new HebbianLinks and HebbianPredicates from existing ones||MOSES can learn patterns in the System Activity Table, which are then used to build HebbianPredicates||NA||Combination of concepts formed via map formation may lead to new concepts that even better direct attention|
|Concept Creation||Allows inferential assessment of the value of new concepts||MOSES can be used to search for high-quality blends of existing concepts (using e.g. PLN and ECAN as the fitness functions)||Allows importance-based assessment of the value of new concepts||NA|
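As a toy illustration of the "blends of existing concepts" entry above: treating concepts as property sets, a blend can be scored by balancing coherence (properties the parents share) against novelty (properties unique to one parent). The scoring function here is a deliberately crude stand-in for the PLN/ECAN-based fitness functions mentioned in the table, and an exhaustive pairwise search stands in for MOSES; all names are hypothetical.

```python
# Toy sketch of concept blending as search (illustrative only; MOSES itself
# evolves program trees, and this score is a stand-in for PLN/ECAN fitness).

from itertools import combinations

def blend_score(a, b):
    shared = len(a & b)   # coherence: properties the parents agree on
    novel = len(a ^ b)    # novelty: properties unique to one parent
    return shared + 0.5 * novel

def best_blend(concepts):
    """concepts: {name: set of properties}. Returns the highest-scoring
    pairwise blend as (name, combined property set)."""
    best = None
    for (na, a), (nb, b) in combinations(concepts.items(), 2):
        score = blend_score(a, b)
        if best is None or score > best[0]:
            best = (score, f"{na}+{nb}", a | b)
    return best[1], best[2]

concepts = {
    "pet":   {"animate", "friendly", "small"},
    "robot": {"mechanical", "friendly", "programmable"},
    "rock":  {"inanimate", "small"},
}
name, props = best_blend(concepts)   # "pet+robot" scores highest here
```

In the actual design, a candidate blend would not be scored by a fixed formula but assessed dynamically, e.g. by how useful PLN finds it in inference and how much attention ECAN ends up allocating to it.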
| How →
|PLN||MOSES||ECAN||Concept Creation|
|Map Formation||Speculative inference can help map formation guess which maps to hunt for||MOSES can be used to search for maps that are more complex than mere “co-occurrence”||ECAN provides the raw data for map formation||No significant direct synergy|
|Goal System||PLN can carry out goal refinement||No significant direct synergy||Flow of importance among subgoals determines which subgoals get used, versus being forgotten||Concept creation can be used to provide raw data for goal refinement (e.g. a new subgoal that blends two others)|
|Simulation||In order to provide data for setting up simulations, inference will often be needed||No significant direct synergy||ECAN tells which portions of a simulation need to be run in more detail||No significant direct synergy|
|Sensorimotor pattern recognition||Speculative inference can help map formation guess which maps to hunt for||MOSES can be used to find subtle patterns in sensorimotor data||ECAN guides pattern recognition via indicating which sensorimotor stimuli and patterns tend to be associatively linked||New concepts may be created that then are found to serve as significant patterns in sensorimotor data|
| How →
|Map Formation||Goal System||Simulation||Sensorimotor pattern recognition|
|Map Formation||NA||Map formation may focus on finding maps related to subgoals, and good subgoal refinement helps here||No significant direct synergy||No significant direct synergy|
|Goal System||Concepts formed from maps may be useful raw material for forming subgoals||NA||No significant direct synergy||No significant direct synergy|
|Simulation||No significant direct synergy||No significant direct synergy||NA||Presence of recognized sensorimotor patterns may be used to judge whether a simulation is sufficiently accurate|
|Sensorimotor pattern recognition||Concepts formed from maps may usefully guide sensorimotor pattern search||Directing pattern search toward patterns pertinent to subgoals may make the task far easier||Patterns recognized in simulations may then be checked for presence in real sensorimotor data||NA|