This roadmap is deprecated in favour of Roadmap. It may still contain relevant information, however, so feel free to update this page to reflect the current roadmap.
OpenCogPrime Roadmap 2009-2011
Proposed work schedule by month: http://spreadsheets.google.com/pub?key=pT15xTF3ys-2F1m1_ottBYA&output=html
Goals of the Proposed Research
The proposed research is viewed as constituting the early stages of a longer-term project aimed at the creation of a human-level AGI (artificial general intelligence) system – one sharing the human mind’s capability for commonsense inference and everyday general intelligence, but also going beyond human capabilities in various ways enabled by its digital computer infrastructure and highly precise algorithms. The scope of applications of human-level AGI systems is tremendous, pervading all aspects of science, engineering, commerce and the humanities. Among other applications, the creation of artificial scientists and engineers would vastly accelerate technological progress along multiple avenues, and the creation of AGIs that are themselves students of AGI holds the possibility of a massive acceleration of artificial intelligence, as AGIs get smarter and smarter at making themselves smarter and smarter.
To get to these wonderful longer-term goals, however, we must proceed via incremental steps, each of which provides coherent, easily testable and demonstrable functionality of its own. This roadmap describes two incremental steps that constitute significant progress along the path toward AGI with capability at human level and beyond, but also possess significant interest and value in their own right. The composite goal of these two steps (Phase 1 and Phase 2) is to create an AGI-controlled virtual agent in an online virtual world that can carry out various tasks in the world and hold simple conversations regarding its world and its interactions with the objects and agents in it. This AGI is supposed to demonstrate human-like commonsense knowledge, at the rough level of a young child, within the domain of its virtual world. While lacking the richness of the real physical world, this world is vastly richer than the various toy domains typically used by AI researchers, and we have put considerable thought into how to create a virtual world sufficiently rich and robust to support the various world-interactions present in, for example, a preschool or a child psychological testing lab.
The goal of Phase 1 of the roadmap is to create an artificial general intelligence (AGI) system capable of controlling a humanoid agent in a 3D simulation world, in such a way that this agent can carry out a simple English conversation about its interactions with its simulated environment. This will be done by building on the current OpenCog and RelEx software systems, guided by the OpenCogPrime software design.
A simple example conversation of the sort envisioned at the end of Phase 1 would be as follows:
|Human (i.e. human-controlled avatar)||What do you see on the table?|
|OpenCog (i.e. OCP-controlled avatar)||A red cube, a blue ball, and a lot of small balls|
|Human||Which is bigger, the red cube or the blue ball?|
|OpenCog||The blue ball|
|Human||Bring me one of the small balls|
|OpenCog||(walks to table, picks up ball, and brings it to human-controlled avatar) Here it is|
|Human||(takes the ball to another table, where there are three cups turned upside down, one red, one green, and one blue) I’m going to put the ball under the red cup (does so, and then shifts the cups around a bit to various different positions) Where is the ball?|
|OpenCog||Under the red cup|
(additional agent, could be human or AI controlled)
|Human||OpenCog, if I ask Bob which cup the ball is under, what will he say?|
|OpenCog||I don’t know|
|OpenCog||He didn’t see you put it under the cup.|
This sort of dialogue seems extremely simple by the standard of ordinary human life, but at present there is no AI system that displays remotely this level of commonsensically savvy conversation, in a reliable and regular way. Yet, all the ingredients needed to support this are present in the OpenCog and RelEx software systems the PI and his colleagues have created over the last 7 years – now it’s “just” a matter of doing some extending, fine-tuning and integrating.
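The closing exchange of the dialogue above depends on tracking which agents witnessed which events. The following is a deliberately hand-coded toy sketch of that one requirement, not a description of how OpenCog, PLN or RelEx represent knowledge. The `Agent` and `Scene` classes and their method names are hypothetical, invented here for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent record: what this agent believes about object locations."""
    name: str
    beliefs: dict = field(default_factory=dict)  # object name -> believed location


class Scene:
    """Toy scene that records which agents witnessed each event (an assumption of this sketch)."""

    def __init__(self, agent_names):
        self.agents = {name: Agent(name) for name in agent_names}

    def observe(self, obj, location, witnesses):
        # Only agents who saw the event update their belief about the object.
        for name in witnesses:
            self.agents[name].beliefs[obj] = location

    def predicted_answer(self, agent_name, obj):
        # Predict what the named agent would say if asked where the object is.
        location = self.agents[agent_name].beliefs.get(obj)
        return location if location is not None else "I don't know"


scene = Scene(["Human", "OpenCog", "Bob"])
# The human hides the ball while only the human and OpenCog are watching.
scene.observe("ball", "under the red cup", witnesses=["Human", "OpenCog"])

print(scene.predicted_answer("OpenCog", "ball"))  # under the red cup
print(scene.predicted_answer("Bob", "ball"))      # I don't know
```

Because the belief is keyed to the cup rather than to a position on the table, shuffling the cups does not change the answer. A scripted tracker like this handles only the cases it was written for, which is why the roadmap aims at a general architecture instead.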
We stress that our goal is not to achieve a few specific examples of this sort of dialogue that can be showcased in a research paper or conference presentation. This has been done before, and could be done using OpenCog with far less effort than the project proposed here. Our goal is to achieve this level of virtual-world-related conversation in a truly robust and flexible way, via using an architecture that is specifically built with more ambitious goals in mind. We also note that, while Phase 1 is an important intermediary step, it is Phase 2 that will demonstrate functionality that dramatically and excitingly goes beyond anything previously obtained in the AI field. What is important in Phase 1 is not so much the behavior achieved, as the fact that the behavior achieved is being achieved using a flexible, growable general intelligence architecture that can be improved to yield the Phase 2 behaviors, and then improved far beyond that level.
Phase 2 will extend the Phase 1 behaviors significantly by enabling the agent to carry out complex, interactive, temporally sequenced “physical” tasks in the virtual environment. This will also extend its linguistic capabilities by providing critical additional experiential grounding for its language abilities.
A simple example of the kind of activity envisioned as possible at the end of Phase 2 is as follows:
|Human||Look over there on the other side of the table. What do you see?|
|OpenCog||Bill and Bob are there.|
|Human||What are they doing?|
|OpenCog||Bill is throwing a red ball to Bob. Bob is throwing a red ball to Bill.|
|Human||Use these blocks on the table to show me what they’re doing.|
|OpenCog||(at the table, stacks up several red blocks to make a “simulated Bob”, and several blue blocks to make a “simulated Bill”, and a small green ball to make a “simulated ball”)|
|Human||(Pointing at the stack of red blocks) What’s that supposed to be?|
|OpenCog||(Points to Bob)|
|Human||Very funny. I meant: which blocks are Bill?|
|OpenCog||(Pointing at the stack of blue blocks) These are Bill|
|Human||Can you use the blocks to show me what Bill and Bob are doing?|
|OpenCog||(Takes the green ball and moves it back and forth in the air between the Bob-blocks and the Bill-blocks, to illustrate the game of catch that Bob and Bill are playing) They’re throwing the ball to each other like this.|
Note the difference here from Phase 1: there’s more than just cognizant conversation, locomotion and simple object manipulation going on; there’s systematic, purposeful, planned manipulation of objects in an interactive social context.
Again, we stress that our goal is not to achieve a handful of examples of this sort of conversation, via carefully customizing or tuning our AI system in an example-specific way. Rather, we seek to achieve this level of functionality in a robust and flexible way. It is important that we are achieving this functionality using a general intelligence architecture created with further expansion in mind. However, further progress aside, we believe that the Phase 2 functionality will be considered sufficiently exciting by the AI community to be taken as a demonstration of dramatic achievement, and of the unprecedented promise of the OpenCogPrime approach to AGI.
The ordering of these two phases may seem peculiar as compared to human cognitive development, but this appears the most natural order in which to proceed given the specific technology bases available to drive development at present.
The specification of precise intelligence tests suitable for measurement of the system’s abilities during each phase is part of the research project itself; but clear guidelines for the creation of such tests have been formulated in recent publications such as the “AGI Preschool” that Ben Goertzel and Stephan Bugaj will present at AGI-09.
Brief Overview of Work Required
Phase 1 will result in an AI system that controls a humanoid agent living in a customized version of the OpenSim virtual world, and carries out simple English conversations about the objects in the world around it, and its interactions therewith. The agent will interact with objects via grasping, pointing, carrying and executing other simple actions, as well as communicating about what it sees and what it and other agents are doing.
The specific work proposed in Phase 1, toward this goal, falls into the following categories:
- Language processing: Enabling intelligent language functionality and sophisticated cognitive guidance of linguistic processing. When such processing exists, the relevant portions of RelEx language comprehension and NLGen generation algorithms can be integrated into the OpenCog framework. See NLP for current status and details.
- Inference: completion of the implementation of the PLN (Probabilistic Logic Networks) inference framework within OpenCog, and customization and tuning of PLN for inference on the output of RelEx
- Attention allocation: completion and implementation of the ECAN economic attention allocation framework within OpenCog
- Dialogue management: implementation within OpenCog of a controller process for managing conversational dialogue, drawing on linguistic and inferential functionality
- Scalability: improvement of the OpenCog infrastructure to support distributed processing, and real-time responsive scheduling
- Virtual embodiment: extension of OpenCog’s current virtual embodiment framework, which has been customized for simplistic virtual pets, to handle more flexibly controllable virtual humanoid agents, via integrating an open-source robot simulator with the OpenSim virtual world
Phase 2 will extend the Phase 1 system to create an embodied agent that carries out complex interactive tasks in the virtual world, communicating about what it does, but also doing systematic planning and reasoning about its actions and communications. The specific technical work proposed in Phase 2 falls into the following categories:
- Language processing: deep integration of PLN inference into the OpenCog language processing framework, allowing parse selection, word sense disambiguation and reference resolution to be carried out in a contextually sensitive way using inferential judgment
- Inference: implementation, testing and tuning of spatiotemporal inference using PLN, allowing reasoning about complex sequences of actions (as coordinated with linguistic knowledge)
- Procedure learning: extension of procedure learning code to allow for
- Attention allocation: integration of attention allocation with action selection and procedure execution, allowing the real-time execution of complex, learned embodied procedures
- Concept formation: implementation of heuristics for the formation of novel concepts, based on combining prior concepts and on recognizing patterns in the current network of knowledge. Includes basic conceptual blending.
- Integrative Intelligence: tuning of the above cognitive procedures to work together effectively, enhancing each other’s intelligence
- Dialogue management: adaptation of conversational patterns using procedure learning based on MOSES, PLN and attention allocation
- Scalability: addition of specialized distributed processing techniques for allowing rapid distributed PLN inference and MOSES procedure learning
- Virtual embodiment: implementation of “bead physics” in the OpenSim virtual world, thus allowing the creation of complex textures, masses, strings and other entities alongside the typical virtual-world objects; implementation of a more flexible body for the AI using bead physics as well
Preliminary Task Breakdown
This section presents a rough work plan, explaining specific tasks that need to be done during the course of the project. For the purpose of this roadmap a 6-monthly breakdown was judged adequate, but a more detailed breakdown can be supplied if this is deemed useful.
In the Phase 2 table, a distinction is made between tasks that can be done in parallel to Phase 1 (but are not needed for the Phase 1 deliverable) versus tasks that really need Phase 1 to be completed in order to make sense as tasks.
|inference||Completion of adaptive PLN forward and backward chainers||Tuning of PLN for inference on the output of RelEx||Customization of PLN for real-time inference during dialogue|
|“||Implementation of PLN rules for intensional and spatiotemporal inference||Implementation of PLN rules for social and epistemic inference|
|“||Testing and tuning of PLN formulas for intensional and spatiotemporal inference||Testing and tuning of PLN formulas for social and epistemic inference|
|attention||Completion of Economic Attention Networks (ECAN) framework for adaptive attention allocation||Integration of ECAN with PLN and RelEx, to enable adaptive inference & language comprehension||Customization of ECAN for real-time attention allocation during dialogue|
|language||Full integration of RelEx and NLGen into OpenCog||Creation of Dialogue Manager||Tuning, testing, refinement of NLP components|
|System scalability and performance||Extension of OpenCog framework to allow effective distributed processing across multiple multiprocessor machines||Implementation of real-time scheduler for OpenCog, to allow rapid responsive processing during dialogue||Code optimization as needed to support real-time dialogue|
|Perception and action||Modification of current PetBrain code to control flexible humanoid agents in virtual world||Improvement of perception and action code to allow more complex, realistic behavior||continued|
|virtual world||Building of appropriate testing/teaching environment in virtual world||Customization of OpenSim to allow external processes to control avatars||Integration of Gazebo robot simulator with OpenSim to allow more flexible avatar control|
- Language integration: Currently, there exist ways of piping RelEx output into OpenCog; however, there is no clean, seamless environment for directing input from a chat system, in real time, to OpenCog for processing. There is no language output at all at this point. Furthermore, RelEx currently has considerable trouble with short, chatty conversational elements: it expects well-formed, grammatical, full sentences as input. Non-grammatical, mis-spelled phrase fragments, as commonly encountered in chat, are unsupported.
|Procedure learning||Yes||Extension of MOSES procedure learning code to handle complex programmatic constructs||Experimentation with learning of complex procedures using MOSES||continued|
|“||Yes||Replacement or augmentation of BOA within MOSES by a more powerful learning method that accounts for prior history|
|inference||No||Design and testing of integration of MOSES with PLN||continued||continued|
|“||No||Testing and tuning of PLN as applied to spatiotemporal task planning||continued||continued|
|attention||No||Integration of economic attention allocation with procedure execution, to allow intelligent adaptive procedure execution||Integration of ECAN with PLN and RelEx, to enable adaptive inference & language comprehension||Customization of ECAN for real-time attention allocation during dialogue|
|language||No||Utilization of PLN to enable experiential language learning||Integration of language learning system with||Tuning, testing, refinement of NLP components|
|integrated intelligence||Partially||Design and implementation of concept formation and goal refinement heuristics||Integration of PLN, MOSES, attention allocation, concept formation, goal refinement, etc.||continued|
|System scalability and performance||No||Implementation of specialized framework for distributed PLN inference||Implementation of specialized framework for distributed MOSES procedure learning||System optimization|
|Perception and action||Partially||Creation of motion learning code capable of learning new movements in the manner of a robot simulator||Creation of perception code allowing the system to identify novel objects||Full integration of perceptual and motor functionalities|
|virtual world||Yes||Customization of ODE physics engine to support “bead physics”||Integration of bead physics into OpenSim to allow more flexible environment||continued|