OpenCog Development Roadmap
A Draft OpenCog Development Roadmap
Updated Dec 2015
This page is a “living document” that roughly outlines a roadmap for the next phases of OpenCog development. Three phases of near-term development are proposed, covering several different practical applications and a wide variety of internal algorithms and structures and their interactions.
(This page was written initially by Ben Goertzel in December 2015, drawing heavily on earlier documents created by Cassio Pennachin, which in turn were based on discussions between Ben, Cassio, Jim Rutt, Linas Vepstas and many other Opencoggers.)
A word on the scope of the roadmap presented here may be helpful. The OpenCog architecture and algorithms generally have been conceived and designed with the goal of general intelligence at the human-level and beyond. The 2014 book Engineering General Intelligence outlines a design called CogPrime, which constitutes a specific approach to creating human-level-and-beyond AGI within the OpenCog platform. However, in the roadmap outlined here, the focus is more on the next few steps along the path toward the broad goal of advanced AGI. The steps outlined here are consistent with the CogPrime design, but also mostly have the potential to be useful within other AGI approaches that might be implemented within the OpenCog framework.
(For those concerned with the future potential for AGI at the human level and beyond, however, we will also give a few brief comments on the longer-term OpenCog/CogPrime roadmap below.)
Goals of the Next Three Development Phases
The three phases of development described here are intended to bring OpenCog from its current state to the point where it displays
- Exciting demonstrability: Displays excitingly demonstrable behaviors that exemplify the key principles underlying the OpenCog architecture (e.g. cognitive synergy between types of learning corresponding to different types of memory; use of a weighted, labeled hypergraph knowledge store to combine different types of knowledge in a flexible and pragmatic way; use of OpenCog to control an embodied intelligent agent)
- Relatively easy usability: Delivers dramatic value to, while requiring only modest effort from, two groups: AI developers who want to implement their algorithms within OpenCog (so as to leverage synergy with OpenCog’s other algorithms, or make use of OpenCog’s representational mechanisms), and application developers who want to use OpenCog to provide added intelligence to the products or services they are building
The OpenCog leadership strongly suspects that once the project achieves exciting demonstrability and relatively easy usability, then -- given the generally favorable climate for AI these days -- from this point onward, various sorts of resources will agglomerate themselves onto the project and a massive acceleration in progress will occur (building on the foundation of the demonstrable/usable codebase plus the fundamentally sound and highly extensible underlying design).
On this page, we have intentionally not committed to a specific timeline for the three phases described. Timing of development for an OSS project like OpenCog depends on multiple factors including degree of funding and volunteer commitment. That said, our current estimation is that each phase could be completed in about a year with a reasonable amount of funding for a core team, augmented by volunteer efforts. If funding and volunteer effort fall short of our hopes, the roadmap still makes sense but the timing will be longer.
OpenCog can be applied to a virtually unlimited variety of areas and applications. The roadmap described here is focused on the following three applications (D, B and G), where the first gets more focus than the latter two, plus a cross-cutting unsupervised-learning effort (U):
- [D] -- Control of a robot head (simulated via Blender, or a physical Hanson robot) carrying out a dialogue with a human, including visual perception of its surroundings and emotional interaction with its conversation partner (see here for some early code using OpenCog to control a Hanson robot head)
- [U] -- Replacement of rule-based, supervised-learning-based or hand-coded components of OpenCog utilized in the initial robot-head dialogue system ([D] above) with components that work based on unsupervised learning from experience. Three key aspects are addressed here: language-comprehension rules, vision processing based on labeled training data, and hard-coded inference control heuristics. (See here for some code giving a very early start toward unsupervised grammar learning.)
- [B] -- Analyzing genomics (SNP and microarray data), and other related biological datasets (see here for some code for importing bio datasets into OpenCog and analyzing them; and here for an R interface to MOSES customized for bio applications)
- [G] -- Playing Minecraft and similar games (e.g. Space Engineers, and OSS Minecraft-like games) (see here for code giving an OpenCog-Minecraft bridge). Note: A Minecraft Bot Development Roadmap is in progress.
Tasks in the lists below will be labeled with one or more of D, U, B or G, according to which of the above areas they are critical for.
A few tasks not critical for any of the above are also placed on the roadmap, because they are seen as generally useful for OpenCog development, or for other envisioned OpenCog applications.
Within each of the three phases described here, the high-level tasks described are broken down into multiple categories: Systems & Tools, Community & Process, PLN, NLP, Learning, ECAN, Perception, Memory, Action, Cognitive Control. This categorial breakdown is basically for convenience and is not intended to represent a profound ontology of cognitive processes or anything like that.
While a specific ordering of tasks into Phases 1, 2 and 3 is provided here, this ordering is not always iron-clad. Sometimes a task is put into a later phase because it depends on tasks done in earlier phases, but occasionally a task is put into a later phase just because it seems less urgently critical. It is possible that some later-phase tasks will end up getting done before some earlier-phase tasks, based on changing practical goals, or the exigencies of funding sources or volunteer enthusiasm, etc. What is presented here is simply one apparently sensible way of breaking down a large amount of complex pending work into sequences and categories, so as to provide higher-level guidance to the next stages of OpenCog development.
(NOTE: A fairly serious attempt has been made, in the below, to link to relevant resources on the wiki and in Github, for the various tasks cited. Mostly, though, what has been done is to link to a resource the FIRST TIME it is relevant in the task list below. When a certain resource is relevant to task A, and then to task B that is obviously a continuation of task A, the resource is often not linked to B, but only to A. Readers: Feel free to introduce additional linkages into the wiki, Github or elsewhere as you judge relevant!)
Notes on the Longer-Term OpenCog Roadmap
Before launching into the particulars of the three-phase, near-term roadmap to be outlined here, we make a few brief comments on the positioning of this roadmap in regard to OpenCog's broader ambitions.
Viewed from a broad proto-AGI-systems perspective, completing the three-phase roadmap outlined above would accomplish a number of things, including:
- exciting, "sexy" demonstrations, and an easily-usable set of interfaces, as highlighted above
- basic commonsense reasoning, using various OpenCog cognitive algorithms in an integrated "cognitive-synergetic" way
- scientific reasoning based on a combination of quantitative, linguistic and relational-DB information (in the bio domain specifically, but from an AI perspective the methods used will be perfectly applicable to any area of science, especially areas that, like biology, tend to be more data driven than mathematical theory driven)
- facility for natural language input and output, probably based on a mix of paradigms (e.g. some experiential/unsupervised learning at the foundation, probably still augmented by some content derived from rule-bases and supervised learning)
- facility for image and video processing using cognition-perception feedback, and a start on audio processing
- elements of perception-cognition-action synthesis, suitable to enable a push toward OpenCog-based robot body control
- a solid video-game-character control interface, suitable for ongoing experimentation with embodied cognition in a way that doesn't entail the complexities of dealing with physical robots
What we see overall in the above is a mix of practical demonstrability/usability, deep cognitive-process integration, and interaction with a variety of environments and data types.
It seems that the successful completion of these three phases of development would prepare OpenCog very effectively for use as the platform of a generation of what we might call "Proto-AGI expert systems." What we mean by this is: Systems that combine
- significant amounts of human-child-like common-sense knowledge about the everyday human world; with
- significant amounts of specialized intelligence focused on particular domains (e.g. control of particular humanoid robot bodies; analysis of genomics data; effective interaction in a particular game world or virtual world; or many other applications less closely tied to the present near-term roadmap)
Our vision is that the completion of the three phases described here will be followed by a phase in which, simultaneously:
- fundamental research focuses on improving the depth and scope and synergetic interaction of OpenCog's cognitive algorithms
- specific pure research focus is placed on endowing and instructing OpenCog systems with the capability for mathematics and computer programming (thus leading toward the possibility of profound self-modification and self-improvement, along technical lines discussed in Engineering General Intelligence)
- systems engineering focuses on increasing the scalability of the OpenCog infrastructure
- application development focuses on the creation of a variety of highly practically useful Proto-AGI expert systems.
Regarding the latter point: Note that this application development may be done by a variety of individuals employed by a variety of different large and small companies, universities, government agencies, etc. At this stage OpenCog should appear to the technology world as a broadly usable tool, which can be relatively easily and profitably incorporated in a variety of products and services -- much as supervised-learning "machine learning" toolkits are viewed today in 2015.
A detailed roadmap for development during this "Proto-AGI Expert system" phase will best be crafted while Phase 3 of the current roadmap is underway.
We envision that work on various Proto-AGI Expert systems, together with fundamental improvements to OpenCog algorithms and infrastructure as outlined above, will gradually lead to true human-level AGI. Exactly how long all these steps will take is difficult to foresee right now. To many of the OpenCog principals, the timeline of this proposed work appears roughly consistent with Ray Kurzweil's projections of human-level AGI by 2029. However, large infusions of funding or effort, or sudden research breakthroughs, could accelerate things surprisingly; or unforeseen roadblocks could occur, slowing things down. These are things that have never been done before, and uncertainties abound. But there is nothing more exciting or important to be working on.
Phase 1
High Level Goals
Overall: In Phase 1 we will focus on making cool working examples of OpenCog doing stuff, that are well-documented and easy to run and modify.
- System/Tools: The main high-level goal here is to have an OpenCog system that is reasonably easy to work with, for AI developers who want to add new AI functionalities to OpenCog, modify existing AI functionalities, or embed technical AI functionalities in a system they are building
- Community/Process:
  - Have thorough, well-maintained basic-level tutorial materials for key aspects of OpenCog.
  - Establish a good process for on-boarding newbies, including a well-maintained set of tasks appropriate for newbies just getting started
  - Establish a process by which the documentation on the wiki and in the codebase is kept up to date and reasonably detailed
- Dialogue: A system that can hold a basic, very simple conversation via a (physical or simulated) robot head, including appropriate references to its visual observations. It’s OK if the system doesn’t know very much or carry out deep chains of reasoning. The main point is actually having a functional, integrated dialogue system using the “right” representations and overall architecture (for an overview of the architecture intended, see the chapter on "Dialogue" in Engineering General Intelligence, Volume 2 (linked from here); also see these OpenCog_Dialogue_Application partial design thoughts). Concurrently, initial experimentation will aim at figuring out how to replace the hard-coded rules inside the NLP comprehension system with learned rules/patterns.
- Unsupervised Learning: Toward the goal of removing hard-coded/supervised-learned components, in this phase we will be doing preliminary experiments on textual data using pattern mining and clustering, to figure out how these algorithms need to be tweaked for unsupervised language learning. Work on generally improving and tuning the various OpenCog learning algorithms will also be valuable in this direction.
- Memory: Effective, appropriate recollection of episodic memories based on a system’s experience. This requires basic event boundary recognition to work.
- Bio: An initial system for automated hypothesis validation and generation, covering hypotheses about the relations between genes/gene-categories and phenotypes. Specifically: A system that can do PLN inferences and pattern mining, combining the output of MOSES analysis with data loaded into the Atomspace from bio-ontologies, in a fully automated way. Also, preliminary work getting OpenCog competent at bio-NLP.
- Gaming: A system that can use basic planning and reasoning to control the goal-driven behavior of a character in Minecraft (or other selected Minecraft-like games).
System and Tools
- Optimize Atomspace performance and CogServer performance, including runtime, memory requirements and multithreading within a single process. [DUBG]
- Implement flexible coordination mechanisms and policies for multi-process computing. See Networked AtomSpaces. [DUBG]
- Implement a flexible, efficient approach for storing additional Atom properties (besides truth value and attention value) (see ProtoAtom for the current proposal) [DUBG]
- Atomese preprocessor for human-friendly syntax that can be turned into valid Scheme, Python or Haskell for execution (or some other technical solution with similar result). The goal is to enable coding of Atomese in a generic, concise syntax, without so much “syntactic noise” resulting from wrapping the Atomese in one or another scripting language (see some speculative musings about this here). [DUBG]
- Higher-order type system for Atomspace (likely some variant of the Haskell type system, providing a specific type signature for each SchemaNode or PredicateNode) (see SignatureLink for detailed, partial design suggestions). [DUBG]
- Initial design of cognitive API, enabling access of OpenCog functions by application developers who do not want to modify (nor need to deeply understand) the AI algorithms inside OpenCog (see some preliminary thoughts here)
- Improve the Atomspace visualizer, to provide concise and easily/relevantly manipulable versions of Atom graphs common in NLP, biology and gaming applications [DUBG]
Community and Process
- Continue to maintain and improve OpenCog install and run tools (docker and ocpkg repositories), continuous integration setup, overall GitHub administration [DUBG]
- Create a comprehensive and highly clear set of tutorials covering all key aspects of OpenCog AI and system functionality (the current Hands On With OpenCog outline is a start in this direction).
- Expansion of wiki/markdown documentation so that it covers all important areas of the current codebase (and make sure the site does not contain obsolete information) [DUBG]
- Develop and maintain interesting demos, with the goal of attracting more community members.
  - This will start with an NLP dialogue demo involving a simulated robot head that talks to the user. [D]
  - Basic demos of Minecraft-or-similar gameplay and bio data mining will also be created, though these are not expected to be as immediately exciting as the dialogue demo. [GB]
PLN
- Note: PLN has its own code, but is based heavily on the OpenCog Unified Rule Engine (URE), which supplies general forward and backward chaining processes that PLN's logic rules utilize.
- Utilize PLN backward chaining for goal satisfaction, i.e. for inferring which procedures are likely to achieve a certain goal in a certain context, based on indirect knowledge [DG]
- Natural language question answering via the backward chainer [D]
- Integration of higher-order types into PLN, so that rule selection and premise selection can be done in a way that obeys type constraints [DUBG]
- Integration of distributional and indefinite truth values into PLN (allowing meaningful confidence rules) (see the original book "Probabilistic Logic Networks" (linked from Background_Publications here) for details on these truth values; see also the Truth Value Toolbox here) [DUBG]
- Implement/integrate temporal reasoning using fuzzy/probabilistic Allen Interval Algebra (Python code for this exists from the old PLN implementation; it needs to be ported to the new PLN and the formulas improved) (see this page for links providing guidance on details) [DG]
- Integrate the PLN code for epistemic reasoning that was written and designed in Summer 2015 (see this page for a link to the code and docs) [DG]
- Hard-code inference control macros that make rule selection work well for selected applications (question answering, biological discovery). These macros may encode a probabilistic heuristic, but the heuristic and its weights won't be learned by the AI. [DUBG]
- Integrate the Quantitative Predicates code with PLN distributional truth values, and test on biological data [UBG]
- Tune PLN inference for generalization from MOSES models (see basic code for importing Boolean MOSES models into the Atomspace for PLN reasoning here) [UB]
- Implement sophisticated PLN rules for estimating the truth values of Boolean combinations (this is needed e.g. for implementing probabilistic programming in the Atomspace) (see this page for some detailed ideas on how to approach this) [UB]
- Implement node truth value estimation code that compensates for the bias that the Atomspace tends to contain more interesting/surprising Atoms than would be randomly expected (see this discussion of the "noisy smokes example" for an explanation) [DUBG]
- Make sure all inference rules work effectively when restricted to a particular specified context (specified via ContextLink), and make the URE accept a context as an optional parameter when launched for forward or backward inference [DUBG]
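The temporal reasoning item above mentions a fuzzy/probabilistic version of Allen Interval Algebra. As a rough illustration only, a fuzzy "before" relation can be sketched in Python as a truth degree that rises linearly with the gap between two intervals. This is not the formula used in the old or new PLN code; the names `Interval`, `fuzzy_before`, `fuzzy_after` and the `tolerance` parameter are hypothetical and chosen just for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed time interval [start, end]; illustrative type, not an OpenCog class."""
    start: float
    end: float


def fuzzy_before(a: Interval, b: Interval, tolerance: float = 1.0) -> float:
    """Degree in [0, 1] to which interval `a` ends before interval `b` starts.

    The crisp Allen relation "a before b" holds when a.end < b.start.
    Here the truth degree ramps linearly from 0 to 1 as the gap
    (b.start - a.end) grows from -tolerance to +tolerance.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    gap = b.start - a.end
    degree = (gap + tolerance) / (2.0 * tolerance)
    return max(0.0, min(1.0, degree))


def fuzzy_after(a: Interval, b: Interval, tolerance: float = 1.0) -> float:
    """Inverse relation: `a` after `b` is the same as `b` before `a`."""
    return fuzzy_before(b, a, tolerance)
```

With this particular ramp, two intervals that exactly touch get a degree of 0.5, which is one arbitrary way of expressing that "before" and "meets" blur into each other when endpoints are uncertain. A real implementation would presumably attach such degrees to PLN truth values rather than return bare floats.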
NLP
- Continue the early experiments on unsupervised learning of link grammar dictionaries that were begun during Google Summer of Code 2015 (see this paper for the underlying concepts, and here for some very partial code)
- Implement versions of these early experiments on unsupervised learning of link grammar dictionaries, using the Pattern_Miner and Atomspace-based clustering as tools
- Integrate OpenCog’s NLP comprehension pipeline with existing OSS biological entity and relationship extractors [U]
- Do preliminary testing/tuning of link parser, RelEx and RelEx2Logic on biological abstracts, and improve rules as needed [B]
- Tune RelEx and RelEx2Logic for comprehension of sentences describing scenes, or series of actions, in a game world [G]
- Tune Microplanner for production of sentences describing series of actions in a game world [G]
- Tune RelEx and RelEx2Logic for comprehension of sentences describing scenes, or series of actions, in everyday robot-head dialogue [D]
- Tune Microplanner for production of sentences needed in everyday robot-head dialogue [D]
Learning
- Refactor aspects of the Pattern_Miner code, so that filters are specified as Atoms rather than in C++ code [DUBG]
- Development and tuning of Pattern_Miner filters and parameters for mining simple patterns in biological data [B]
- Development and tuning of Pattern_Miner filters and parameters for mining patterns of word co-occurrence in a corpus [U]
- Development and tuning of Pattern_Miner filters and parameters for mining patterns of block arrangement in a Minecraft (or similar) world [G]
- Implement probabilistic programming on the Atomspace [DUBG], initially via:
  - Adding stochastic predicates to Atomese
  - Implementing standard conditionalization queries using Monte Carlo search
  - Implementing optimization queries using custom callbacks for the Pattern Matcher
- Implement clustering in the Atomspace (see one recent design sketch here)
- Integrate blending with PLN forward chaining, so that both can occur together in the same forward-chaining cognitive process (so that forward chaining becomes a rich form of creative idea generation) [this will add creativity and value to all applications, yet is not a critical aspect for basic functionality in any of the DUBG applications considered in this roadmap]
- Integrate PLN into blending (see the current blending code at https://github.com/opencog/opencog/tree/master/opencog/python/blending), so that PLN is used to assess the interestingness of the conclusions ensuing from a blend [this will add creativity and value to all applications, yet is not a critical aspect for basic functionality in any of the DUBG applications considered in this roadmap]
- Utilize the surprisingness formulas in the Pattern_Miner, to assess the surprisingness of Atoms in the Atomspace [DUBG]
- Pattern_Miner based event boundary recognition (based on the heuristic that event boundaries are discontinuities of predictability) [DG]
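To make the probabilistic programming item above more concrete, here is a minimal sketch of a conditionalization query answered by Monte Carlo sampling (plain rejection sampling). It uses ordinary Python functions in place of Atomese stochastic predicates, so it says nothing about how this would actually be expressed in the Atomspace; the toy model, its probabilities, and the names `sample_world` and `conditional_probability` are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict

World = Dict[str, bool]


def sample_world(rng: random.Random) -> World:
    """Draw one possible world from a toy generative model.

    The predicates and probabilities are made up for illustration.
    """
    rain = rng.random() < 0.2        # stochastic predicate: P(rain) = 0.2
    sprinkler = rng.random() < 0.1   # stochastic predicate: P(sprinkler) = 0.1
    return {"rain": rain, "sprinkler": sprinkler, "wet": rain or sprinkler}


def conditional_probability(
    query: Callable[[World], bool],
    evidence: Callable[[World], bool],
    n_samples: int = 20000,
    seed: int = 0,
) -> float:
    """Estimate P(query | evidence) by rejection sampling."""
    rng = random.Random(seed)
    accepted = 0
    hits = 0
    for _ in range(n_samples):
        world = sample_world(rng)
        if not evidence(world):
            continue  # reject worlds inconsistent with the evidence
        accepted += 1
        if query(world):
            hits += 1
    if accepted == 0:
        raise ValueError("no sample satisfied the evidence")
    return hits / accepted


# Query: how likely is rain, given that the grass is wet?
estimate = conditional_probability(
    query=lambda w: w["rain"],
    evidence=lambda w: w["wet"],
)
```

For this toy model the exact answer is 0.2 / (1 - 0.8 × 0.9) ≈ 0.714, so the estimate can be checked against it. Rejection sampling is the simplest possible choice and wastes every sample that contradicts the evidence; the "Monte Carlo search" mentioned in the roadmap may well mean something more efficient.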
ECAN
- Refactor ECAN so that it works effectively on large Atomspaces [DUBG]
  - Split non-attentional focus processing into a background layer, possibly GPU-based
Perception
- Support multiple space maps, with linkages between them where appropriate (e.g. eye cameras, chest camera, hand cameras on a robot) [D]
- Support Minecraft perceptions in a way that doesn't interfere with robot head perceptions [G]
- Implement deep learning for vision processing, via wrapping external libraries in GroundedSchemaNodes; initially to be tested on supervised Object & Face recognition based on training data. See the current working design outline [D] [note that vision processing can also be useful in the gaming and bio domains, though is not critical for basic functionality there]
Memory
- Implement effective means of indexing episodic memories. Episodic memories may be stored in the Atomspace, but we need to be able to access them associatively like the human mind does. Making sure ECAN learns HebbianLinks between the Atoms involved in a single demarcated episode would seem to be one good way of doing this. [DG]
- Implement buffer structures for Working Memory: language buffer, spatial buffer, episodic buffer [DUG]
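The idea of indexing episodic memories through HebbianLinks between Atoms that occur in the same demarcated episode can be illustrated with a toy co-occurrence index. This is only a sketch of the general Hebbian principle, not ECAN's actual update rule or data structures; the class `HebbianIndex`, its methods, and the string "atoms" are hypothetical stand-ins.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Tuple


class HebbianIndex:
    """Toy associative index; not the ECAN implementation."""

    def __init__(self) -> None:
        # unordered pair of atom names -> accumulated link weight
        self._weights: Dict[FrozenSet[str], float] = defaultdict(float)

    def observe_episode(self, atoms: Iterable[str], increment: float = 1.0) -> None:
        """Strengthen the symmetric link between every pair of atoms in one episode."""
        for a, b in combinations(sorted(set(atoms)), 2):
            self._weights[frozenset((a, b))] += increment

    def recall(self, cue: str, top_k: int = 3) -> List[Tuple[str, float]]:
        """Return the atoms most strongly linked to `cue`, strongest first."""
        scores: Dict[str, float] = {}
        for pair, weight in self._weights.items():
            if cue in pair:
                (other,) = pair - {cue}
                scores[other] = weight
        # Sort by descending weight, then by name so ties are deterministic.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return ranked[:top_k]


index = HebbianIndex()
index.observe_episode(["kitchen", "coffee", "alice"])
index.observe_episode(["kitchen", "coffee", "bob"])
index.observe_episode(["garden", "bob"])
```

Calling `index.recall("coffee")` ranks "kitchen" first because it co-occurred with "coffee" in two episodes. A real system would also need decay and normalization of link weights, which this sketch leaves out.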
Action
- Action orchestration: manage multiple action schemata at the same time, interleave slow and fast ones, etc. See the first-pass Action orchestrator design on the Hanson Robotics wiki site [DG]
- Behavior rules for engaging with humans in the visual field of the robot, actions that make the robot look alive, animations, etc (see various discussions on the Hanson robotics wiki site for insight here) [D]
- Play Minecraft more effectively (debug and improve the existing OpenCog-Minecraft bridge, to make it support text chat, have memory of other players and sessions, etc) [G]
Cognitive Control
- Implementation of appropriate goals for robot head control, in OpenPsi [D]
- Implementation of appropriate goals for Minecraft character control, in OpenPsi [G]
- Tuning of action selection and other OpenPsi parameters for robot head control [D]
- Tuning of action selection and other OpenPsi parameters for Minecraft character control [G]
- Simple Discourse planning (macro and micro narratives) via speech act schemata for each context (see this page for some related design ideas) [DG]
Phase 2
High Level Goals
Overall: In Phase 2 we will focus on making the “cool working examples” from Phase 1 work better, and in particular, on making them demonstrate the unique principles underlying OpenCog, particularly “cognitive synergy” (interoperation of cognitive algorithms focused on different types of memory).
- System/Tools:
  - Have an OpenCog system that is relatively straightforward to use for application developers in selected application domains, who want to use OpenCog AI as a tool but don’t need to modify or augment the underlying cognitive algorithms or representations
  - Have an OpenCog that is scalable across dozens to hundreds of machines, for particular usage scenarios
- Community/Process: Have thorough educational materials for individuals who want to become OpenCog developers, covering beginner through intermediate level
- Dialogue: A dialogue system that has a bit more intelligence in drawing conclusions from its observations – making statements via combining observations with background knowledge, and inferring simple but not-explicitly-stated aspects of the human conversation partner’s intent. And the first real attempts to replace hand-coded syntactic rules with the results of learning.
- Unsupervised Learning: Toward the goal of removing hard-coded/supervised-learned components, in this phase we will start actually doing serious experiments on large amounts of textual data, oriented toward using OpenCog learning algorithms to learn grammatical rules. We will also do preliminary experiments using pattern mining, clustering and PLN to learn inference control rules, and to provide cognitive feedback to deep learning vision algorithms.
- Bio: More complex hypothesis formation and validation via bringing together PLN, MOSES and pattern mining in synergetic ways on biological data/knowledge. Integration of simple instances of bio relationships extracted from text into inferences.
- Gaming: A system that can creatively build things (in a Minecraft-like world) to achieve its goals, e.g. using concept blending and planning to figure out what to build. A system that can communicate about what it’s doing in the game world, and describe what others are doing in the game world.
System and Tools
- Extend and optimize distributed computing functionality, including distributed Atomspace performance and policies as well as distributed cognitive processing on top of existing Gearman, Pattern Miner and MOSES distribution functionality. [DUBG] See:
  - Networked AtomSpaces for a definition of general terms and core concepts
  - this PDF for design ideas regarding making the OpenCog Atomspace distributed (obsolete)
  - the "Distributed Processing" section of the MOSES man page for info on how to run MOSES across multiple machines
  - Pattern_Miner#Steps_to_run_a_distributed_pattern_miner_test to learn how to run a test of the Pattern_Miner's distributed multi-machine capability
- Extend and improve the Atomspace visualizer, including high-level visualization of specific sets of Atoms (e.g., PSI goals, emotions and context; behavior tree forests; etc) [DUBG]
- Implementation of significant elements of the Cognitive API
- Modify Pattern Matcher internals to make them use parallel backtracking (many randomized parallel backtracking algorithms exist). Different algorithms might be optimal for a vanilla SMP machine versus a more exotic architecture like Intel Xeon Phi. [DUBG]
Community and Process
- Begin accumulating video lectures on specific aspects of OpenCog, toward the eventual creation of an OpenCog MOOC
PLN
- Action sequence planning (this includes the work to port Shujing's planner to the URE; see here for an overview of this planner; and here for the old planner code, not in the main OpenCog repo anymore) [DG]
- PLN inference for word sense disambiguation [D]
- Extend anaphor resolution code to use PLN for selecting anaphora [D]
- First step toward adaptive, history-based inference control [DUBG]
  - Implement Markov Chain based inference history mining, incorporating association of chains of inference rules with specific contexts
  - Wrap this in a probabilistic programming framework
- Tune distributional TVs for quantitative data analysis using PLN [B]
- Biological inference: Utilize PLN analogical reasoning to transfer conclusions from one organism to another [B]
- Implement and integrate spatial inference using fuzzy/probabilistic RCC-3D [G]
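The Markov-chain-based inference history mining listed above can be pictured as counting which inference rule tends to follow which, separately for each context, and then using those counts to bias rule selection. The sketch below is a guess at the simplest form this could take (a first-order chain with maximum-likelihood estimates); it is not taken from the PLN or URE code, and the class `RuleTransitionModel`, the context names and the rule labels are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional, Sequence, Tuple


class RuleTransitionModel:
    """First-order Markov model over inference rule names, kept per context."""

    def __init__(self) -> None:
        # (context, previous_rule) -> counts of the rule applied next
        self._counts: Dict[Tuple[str, str], Counter] = defaultdict(Counter)

    def record_chain(self, context: str, rules: Sequence[str]) -> None:
        """Add one inference chain (a sequence of rule names) observed in `context`."""
        for prev_rule, next_rule in zip(rules, rules[1:]):
            self._counts[(context, prev_rule)][next_rule] += 1

    def next_rule_probabilities(self, context: str, prev_rule: str) -> Dict[str, float]:
        """Maximum-likelihood estimate of P(next rule | context, previous rule)."""
        counts = self._counts.get((context, prev_rule))
        if not counts:
            return {}
        total = sum(counts.values())
        return {rule: n / total for rule, n in counts.items()}

    def suggest(self, context: str, prev_rule: str) -> Optional[str]:
        """Most probable next rule, or None if this situation was never seen."""
        probs = self.next_rule_probabilities(context, prev_rule)
        if not probs:
            return None
        # Iterating in sorted order makes ties resolve alphabetically.
        return max(sorted(probs), key=probs.get)


model = RuleTransitionModel()
model.record_chain("question-answering", ["deduction", "modus-ponens", "deduction"])
model.record_chain("question-answering", ["deduction", "modus-ponens", "abduction"])
model.record_chain("bio", ["deduction", "abduction"])
```

Here `model.suggest("bio", "deduction")` returns `"abduction"`, while the same previous rule in the "question-answering" context leads to `"modus-ponens"`, which is the context-dependence the roadmap item asks for. Smoothing, longer histories and weighting chains by how useful their conclusions were are all left out of this sketch.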
NLP
- Improvements to SuReal's sentence matching [D]
  - Extend SuReal for fragment matching and then gluing fragments together (see the chapter on "Language Generation" in the book "Engineering General Intelligence" for some relevant concepts here) [D]
- Extend anaphor resolution code to handle nominal as well as pronominal anaphora [D]
- Language generation without matching, as plan B. This involves creating a set of URE rules that remove the need for RelEx, and map link parser output directly to Atoms. We can then run these rules in reverse and use CSP to produce sentences from the link grammar parse. [D]
- Ongoing work to get unsupervised learning of syntax to work effectively [U]
- Early experimentation with unsupervised learning of RelEx2Logic type rules [U]
- Use NLP comprehension pipeline to feed a reasonably large amount of bio-text (e.g. PubMed abstracts) into Atomspace, for use in reasoning [B]
- Refactor microplanner to use more Atomese internally, rather than just being Scheme code [D]
Learning
- As well as refining the use of all the learning tools via their application to various tasks, a key thrust here is implementing and tuning synergy between MOSES and other tools (PLN, pattern mining, URE). MOSES is a very powerful learning algorithm, designed for OpenCog integration, but currently is being used largely separately from the rest of OpenCog. A main goal in this phase is to change that. [UBG]
  - MOSES to guide the pattern miner (we'd seed MOSES with our greedy heuristics and have it learn better ones), probably also implemented within a probabilistic programming framework
  - Feed results of PLN inference on MOSES models back into MOSES, thus closing the loop
  - Reimplement Reduct in Atomspace using URE, then integrate MOSES loop tightly with Atomspace
- Scalable, complex application of pattern mining to bio data [B]
- Scalable, complex application of pattern mining to text data (for unsupervised language learning) [U]
- Scalable, complex application of pattern mining to game-world data [G]
ECAN
- Implement goal importance-driven attention allocation (including the RFE mechanism described in Engineering General Intelligence, or some equivalent) [DG]
- Heuristics for assignment of credit in the dialogue context [D]
- Heuristics for assignment of credit in the game-playing context [G]
Perception
- Extend Atomspace-based deep learning infrastructure to handle video processing [D], and test this on:
  - Hand gesture recognition
  - Facial expression and emotion recognition
- Initial experimentation with integration between PLN and deep learning, so inference can be used to help choose inputs fed into the neural nets accessed via GroundedSchemaNodes [D]
- Learning of word meanings via correlating words with visual observations [DG]
Memory
- Tune working memory buffers for effective operation in robot dialogue and game-playing contexts [DG]
- Optimize Atomspace queries required for associative episodic memory retrieval [DG]
Action
- Complex multi-part action learning for Minecraft, combining planning-based reasoning and more generic PLN reasoning [G]
- Implement basic infrastructure for deep learning of movement actions (this is needed for controlling a robot body, which is a medium-term OpenCog goal, but not really for controlling a robot head or a Minecraft character)
- Support planning for goals with different timescales [DG]
- Use of PLN-based planner for discourse planning, taking into account the narrative structure of a discourse over different time-scales [D]
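As one illustration of the word-meaning item above (correlating words with visual observations), here is a minimal sketch that scores word/object pairs by pointwise mutual information over co-occurrence counts. This is a generic cross-situational learning baseline and not the OpenCog method. The episode format, the function names and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_object_pmi(episodes):
    """Score (word, object) pairs by pointwise mutual information.

    `episodes` is a list of (words, objects) pairs: the words heard and
    the object labels seen at the same moment. Hypothetical format.
    """
    word_counts, object_counts, pair_counts = Counter(), Counter(), Counter()
    total = 0
    for words, objects in episodes:
        total += 1
        words, objects = set(words), set(objects)
        word_counts.update(words)
        object_counts.update(objects)
        pair_counts.update((w, o) for w in words for o in objects)
    # PMI = log( P(w, o) / (P(w) * P(o)) ), estimated from episode counts.
    return {
        (w, o): math.log(count * total / (word_counts[w] * object_counts[o]))
        for (w, o), count in pair_counts.items()
    }


def best_referent(pmi, word):
    """Return the object with the highest PMI for `word`, or None."""
    candidates = {o: score for (w, o), score in pmi.items() if w == word}
    return max(candidates, key=candidates.get) if candidates else None


# Toy data (invented): words heard alongside visually detected objects.
episodes = [
    ({"look", "a", "dog"}, {"DOG", "BALL"}),
    ({"the", "dog", "runs"}, {"DOG"}),
    ({"a", "red", "ball"}, {"BALL"}),
    ({"look", "ball"}, {"BALL"}),
]
pmi = word_object_pmi(episodes)
```

With this toy data, `best_referent(pmi, "dog")` picks `"DOG"` even though a ball was also visible in one of the "dog" episodes. PMI tends to over-reward rare objects, so a real system would likely need smoothing or a probabilistic treatment of the counts.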
High Level Goals
Overall: In Phase 3 we will focus on making the Phase 2 functionalities reliable, scalable and easily customizable, i.e. on “productization” of the Phase 2 functionalities. Also, leveraging Phase 1 and 2 research, we will replace, as far as possible, hard-coded or supervised-learned components of the system with components based on experiential or unsupervised learning.
As this phase is further out, it is anticipated that more tasks will accumulate for Phase 3 as Phases 1 and 2 progress.
- Have a reasonably thorough “Cognitive API” that spans a decent variety of application tasks that folks might want to do with OpenCog
- Have an OpenCog system that is scalable across thousands of machines, for the usage scenarios that are important for the D, U, B and G areas
- Community/Process: Have a solid process in place for providing help and guidance to application users who want to use OpenCog tools in their applications (but are not necessarily AI developers themselves)
- Dialogue: A dialogue system that actively engages with its conversation partner, asking questions to gain more knowledge about topics of its interest, as well as to better understand its conversation partner’s intent. Dialogue that demonstrates grasp of simple aspects of “theory of mind.” Here we are aiming at dialogue with roughly the level of commonsensical understanding of a young human child.
- Unsupervised Learning: Toward the goal of removing hard-coded/supervised-learned components, here we will aggressively replace hard-coded/supervised-learned components of the NLP comprehension system with corresponding components produced via unsupervised/experiential learning wherever possible. If some hard-coded/supervised-learned components remain, these will be taken as the subject of ongoing R&D. We will also insert unsupervised/experientially learned components into the PLN inference control infrastructure and the vision processing infrastructure wherever R&D results make this possible, while continuing ongoing research in these directions.
- Bio: Hypothesis formation and evaluation based on combining patterns from data and ontologies with complex patterns extracted from texts via the NLP system.
- Gaming: Creative interactive sandbox gameplay, in which a system plays interactively with people and displays understanding of their intentions and preferences (and communicates usefully about these).
System and Tools
- Create an OpenCog architecture that can achieve agent control via effectively leveraging thousands of machines in the cloud. This may occur via a somewhat specialized, perhaps even somewhat application-specific architecture – e.g. a pool of machines just for Pattern Mining; a pool for Distributed MOSES, a pool for language generation based on Atomspaces that are refreshed with new information only periodically; etc.
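The specialized-pool idea above can be sketched as a small dispatcher that routes each kind of task to its own group of machines. This is only an illustrative sketch and not an OpenCog API. The class names, pool names, machine names and the one-hour refresh interval are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WorkerPool:
    """A named group of machines dedicated to one kind of workload."""
    name: str
    machines: list
    # Seconds between Atomspace refreshes; None means "kept live".
    atomspace_refresh_s: Optional[float] = None
    pending: deque = field(default_factory=deque)

    def assign(self):
        """Drain the queue, spreading tasks round-robin over the machines."""
        assignments = []
        slot = 0
        while self.pending:
            task = self.pending.popleft()
            machine = self.machines[slot % len(self.machines)]
            assignments.append((machine, task))
            slot += 1
        return assignments


class Dispatcher:
    """Routes each task to the pool registered for its kind."""

    def __init__(self, pools):
        self.pools = {pool.name: pool for pool in pools}

    def submit(self, kind, task):
        if kind not in self.pools:
            raise KeyError(f"no pool registered for task kind {kind!r}")
        self.pools[kind].pending.append(task)


# Hypothetical layout mirroring the three example pools named in the text.
dispatcher = Dispatcher([
    WorkerPool("pattern_mining", ["pm-0", "pm-1"]),
    WorkerPool("moses", ["moses-0"]),
    WorkerPool("language_generation", ["nlg-0"], atomspace_refresh_s=3600.0),
])
for batch in ("mine-batch-1", "mine-batch-2", "mine-batch-3"):
    dispatcher.submit("pattern_mining", batch)
dispatcher.submit("moses", "evolve-model-1")
mining_plan = dispatcher.pools["pattern_mining"].assign()
```

Keeping a separate queue and refresh policy per pool reflects the suggestion that each pool may be tuned for its application. A real deployment would also need fault handling and Atomspace synchronization, which this sketch leaves out.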
Community and Process
- Complete creation of an OpenCog MOOC, and do a “soft launch” …
- Establish relationships with one or more consulting firms that are certified by the OpenCog Foundation to offer advice or hands-on help to companies or individuals who want to use OpenCog in their applications
- Tuning inference control for complex analogical reasoning [DBG]
- Using the Pattern Miner for learning of inference control macros as an improvement on mining Markov Chains of past inference sequences stored in the Atomspace. This would remove the need for hardcoding macros, though the practicalities are subtle. [DUBG]
- PLN for inference on complex relationships output from bio-NLP (integrated with Atoms resulting from MOSES models and from the importation of bio-ontologies) [B]
- Application of unsupervised learning code to grounded language data (based on game-world interaction, and robotic visual/conversational interaction) [U]
- Implementation of hooks via which semantic understanding biases syntactic parsing [DU]
- Assignment of credit based on applying the Pattern Miner to an auxiliary Atomspace that stores information regarding the activities carried out and orchestrated in the main Atomspace [DUBG]
- Extend OpenCog-based deep learning infrastructure to audition [D]
- Primary initial experiments: auditory emotion recognition
- First experiments with phoneme recognition, as a prelude to later tackling speech-to-text
- Deeper experimentation with top-down feedback in which cognitive conclusions feed additional inputs to neural nets wrapped in GroundedSchemaNodes [D]
- Deep learning of actions for robot body control
- Theory of mind (a subset of the theory of mind that young children learn) via epistemic logic in PLN [DG]