OpenCog HK 2014 Dialogue System Task Breakdown

From OpenCog

OpenCog Dialogue System: High Level Task Breakdown and Draft Development Plan

First version created by Ben Goertzel, May 7, 2014

Chat Bot Tasks

Task 1: Testing of OpenDial w/ speech to text, text to speech; initial integration of Hanson Robotics dialogue rules

OpenDial has already been integrated with open source speech to text and text to speech software, so this portion of it is presumably “just” a matter of getting existing OSS software to work and understanding its limitations.

The Hanson Robotics dialogue rules exist in a variety of formats, including AIML and others, and porting them to OpenDial will require a combination of scripting and hand-coding.

DONE!

Current time estimate: 1 month
Tentatively assigned to: Man Hin 
Tentatively scheduled for: June

Task 2: Integration of more complex Hanson robotics dialogue rules

Some AIML and other Hanson Robotics chat rules are slightly complex in that they involve variables that are set at one point in the dialogue and used at some other point. These can also be integrated with OpenDial via a combination of scripting and hand-coding, but there will likely be a number of special cases to consider separately.

DONE!

Current time estimate: 1 month
Tentatively assigned to: Man Hin
Tentatively scheduled for: July

Task 3: Integration of MegaHal and OpenEphyra into OpenDial

The MegaHal system generates entertaining, quasi-random stochastic dialogue, based on combining phrases from a training corpus with pieces of what humans have said to it during conversation.

OpenEphyra is a question answering system that queries a search engine (currently Bing, or else a local corpus via the Indra software) and supplies answers based on this. Utilization of Bing is currently too slow for practical use, so if this cannot be sped up, it should be changed to work based on Indra instead, using an appropriate local knowledge store.

These two software systems have been integrated into an AIML-based dialogue system using the ProgramD chat software (in the OpenCog HK team in early 2014); the integration into OpenDial can use this as a guide.

DONE!

Current time estimate: 1 month
Tentatively assigned to: Man Hin
Tentatively scheduled for: Aug

Task 4: Tuning of OpenDial chat system

Chatting with the OpenDial system and seeing when it responds well and when it doesn’t, and adjusting various rules and parameters accordingly.

! waiting for feedback from Hanson Robotics !

Current time estimate: 1 month 
Tentatively assigned to: Melissa
Tentatively scheduled for: ??

Dialogue System Tasks

Task 1: NLGen -- surface realization

This task involves making a real-time-usable, reasonably robust implementation of the “NLGen 1” algorithm for surface realization, a prototype of which was previously implemented by Ruiting Lian a few years ago.

This NLGen system translates a set of RelEx relationships into a set of link parser links corresponding to a grammatical sentence. Some statistical biasing may also be applied so that the generated sentence is not too unnatural relative to the habitual patterns of human communication.
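
To make the RelEx-to-link-parser translation concrete, here is a toy illustration only: a lookup table maps RelEx-style dependency relations to link-parser link types. The relation names follow RelEx conventions, but the link-type table and the word-ordering logic are invented simplifications of what NLGen 1 actually does.

```python
# A RelEx relation is modeled as (relation_name, head_word, dependent_word).
RELEX_TO_LINK = {
    "_subj": "S",   # subject-verb link
    "_obj":  "O",   # verb-object link
    "_amod": "A",   # adjective-noun link
}

def relations_to_links(relations):
    """Translate RelEx relations into (link_type, left_word, right_word)."""
    links = []
    for rel, head, dep in relations:
        link_type = RELEX_TO_LINK.get(rel)
        if link_type is None:
            continue  # unknown relation: a real system needs a rule for it
        # Subjects precede the verb; objects and modifiers follow the head.
        pair = (dep, head) if rel == "_subj" else (head, dep)
        links.append((link_type,) + pair)
    return links

# "dogs chase cats" expressed as RelEx relations
print(relations_to_links([("_subj", "chase", "dogs"), ("_obj", "chase", "cats")]))
```

The real task is much harder, since a full set of links must jointly satisfy the link grammar dictionary, but the input/output shape is as above.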

Tentatively assigned to: Amen
Current time estimate: N months, 1<N<3
Tentatively scheduled for:  Dec/Jan/Feb

Task 2: Script to translate Relex2Logic rules into ImplicationLinks (enabling execution via Atomspace-based rule engine)

Currently the RelEx2Logic rules exist in a mixed format: there are “main rules” executed via a custom Java rule engine embedded in a RelEx output generator, assisted by “helper functions” written in Scheme and interacting directly with the Atomspace.

We would like to eliminate the use of the custom Java rule engine and have each RelEx2Logic rule executed via triggering an ImplicationLink (wrapped in a BindLink) in the AtomSpace. This involves wrapping the helper functions in GroundedSchemaNodes, and writing a script to map the main RelEx2Logic rules into ImplicationLinks among Atoms. In principle this is straightforward but various fixes and tweaks to various AtomSpace related functions will likely be required along the way.
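
A hypothetical sketch of the target structure only: each RelEx2Logic "main rule" becomes a BindLink whose rewrite side calls a Scheme helper wrapped in a GroundedSchemaNode. Atoms are modeled here as nested tuples; the type names mirror AtomSpace conventions, but this is not the real AtomSpace API, and "pre-subj-rule" is an invented helper name.

```python
def rule_to_bindlink(pattern, helper_scheme_fn):
    """Wrap a main rule as a BindLink triggering a GroundedSchemaNode."""
    return ("BindLink",
            ("VariableList", ("VariableNode", "$X"), ("VariableNode", "$Y")),
            pattern,                                  # antecedent to match
            ("ExecutionOutputLink",                   # consequent: call the helper
             ("GroundedSchemaNode", "scm: " + helper_scheme_fn),
             ("ListLink", ("VariableNode", "$X"), ("VariableNode", "$Y"))))

rule = rule_to_bindlink(
    ("EvaluationLink", ("PredicateNode", "_subj"),
     ("ListLink", ("VariableNode", "$X"), ("VariableNode", "$Y"))),
    "pre-subj-rule")   # invented helper name, standing in for a real Scheme helper
print(rule[0])
```

The proposed script would emit one such structure per main rule, so that the pattern matcher can trigger the helpers directly.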

Tentatively assigned to: Matthew
Current time estimate: 1 month
Tentatively scheduled for: Dec

Task 3: Logic2RelEx -- write script to reverse Relex2Logic rules

Once the RelEx2Logic rules have been expressed as ImplicationLinks, it should be possible to automatically (or semi-automatically) reverse these rules to obtain Logic2Relex rules, that can be used in language generation. However, there may be significant subtlety here, in terms of the internal operation of some of the helper functions.

!obsoleted!

Dependencies: 2
Tentatively assigned to: ???
Current time estimate: 2 months
Tentatively scheduled for: July-Aug

Task 4: Rule engine to run (Atomspace-based versions of) Relex2Logic, Logic2RelEx rules, using pattern matcher

A conceptual design for a “general OpenCog rule engine” operating on rules expressed as ImplicationLinks in the AtomSpace has been articulated. This task involves building the core of such a rule engine (without all ultimately desired functions necessarily being implemented at this stage), and utilizing it for execution of RelEx2Logic and Logic2RelEx rules.

For initial testing, a few RelEx2Logic and Logic2RelEx rules may be created in ImplicationLink format by hand.

In principle this should be straightforward -- but various fixes and tweaks to various AtomSpace related functions will likely be required along the way.
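
A toy forward chainer in the spirit of such a rule engine: rules are (premise-pattern, conclusion) pairs standing in for ImplicationLinks, applied repeatedly until no new facts appear. The fact and rule encodings are invented; the real engine operates on Atoms via the pattern matcher.

```python
def forward_chain(facts, rules, max_steps=100):
    """Apply rules to the fact set until fixpoint (or a step limit)."""
    facts = set(facts)
    for _ in range(max_steps):
        new = set()
        for premise, conclusion in rules:
            for fact in facts:
                if fact[0] == premise:          # match on the predicate slot
                    derived = (conclusion,) + fact[1:]
                    if derived not in facts:
                        new.add(derived)
        if not new:                             # fixpoint reached
            break
        facts |= new
    return facts

rules = [("parent", "ancestor")]               # "parent(X,Y) => ancestor(X,Y)"
out = forward_chain({("parent", "Tom", "Bob")}, rules)
print(out)
```

The design question for the real engine is how rule selection and variable unification are delegated to the pattern matcher rather than done by ad hoc loops like the above.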

ALMOST DONE! (well, still needs testing)... though actually PLN, not RelEx2Logic, was used for initial testing...

Tentatively assigned to: Misgana (lead developer), Cosmo 
Current time estimate: 2 months
Tentatively scheduled for: maybe done in December? 

Task 5: Testing / tweaking of surface realization (from logic to English )

Tasks 1-4 involve developing a complete pipeline that takes a sentence-sized collection of logical Atoms, and turns it into English.

This task involves developing a set of test cases for the whole logic-to-English pipeline, and evaluating the system on these test cases, and fixing the problems that will inevitably arise in this process.

! obsoleted due to redundancy!

Dependencies: 1,3,4
Tentatively assigned to: nobody
Current time estimate: 0
Tentatively scheduled for: never

Task 6: Word/phrase selection

Given a set of Atoms to be articulated as a series of sentences, the problem arises that some of these Atoms may not be directly linked to any WordNodes, so that there may not be any obvious way to articulate them as words.

The task, then: for each Atom chosen for articulation, find a reasonably accurate way to represent it as a {ConceptNode or PredicateNode linked to a WordNode via a ReferenceLink}, or as a small combination of such Nodes.

In the initial case (which is all that this task covers), we will just handle searching the Atomspace for the best representation, using simple heuristics. A later task (Task 8) will involve using PLN to infer word-based representations of concepts.
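
One possible heuristic, sketched with a dict standing in for the Atomspace (all names illustrative): look for a direct ReferenceLink to a WordNode first, and fall back to the most strongly linked neighboring concept that has one.

```python
def select_word(concept, reference_links, similarity_links):
    """Return a word for `concept`, or None if no representation is found."""
    if concept in reference_links:              # direct WordNode attachment
        return reference_links[concept]
    # Fallback: strongest similar concept that does have a WordNode
    candidates = [(strength, reference_links[c])
                  for c, strength in similarity_links.get(concept, [])
                  if c in reference_links]
    if candidates:
        return max(candidates)[1]               # highest link strength wins
    return None

refs = {"cat": "cat", "feline": "feline"}                   # concept -> word
sims = {"Felis_catus": [("cat", 0.9), ("feline", 0.7)]}     # weighted links
print(select_word("Felis_catus", refs, sims))
```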

Dependencies: 22 (discourse model)
Tentatively assigned to: William
Current time estimate: 2 months 
Tentatively scheduled for: the future (after microplanning is "done") 

Task 7: Microplanning for language generation (select sub graphs of RelEx graph to turn into sentences, insert pronouns ... )

Suppose one has a set S of (say, 10-200) Atoms intended to be used as the basis for generating a series of one or more related sentences. NLGen, which handles surface realization, operates on a “sentence sized chunk” of logical Atoms and translates it into an English sentence. Thus there arises the task of taking the Atom-set S and iteratively turning it into a series of sentence-sized chunks. This is an instance of what is called “microplanning” in the NL generation literature.

The basic idea for an initial version of the microplanner is described on the page Microplanner.
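
A minimal sketch of the chunking step, assuming relations that share an argument belong together: greedily group relation triples into chunks capped at a few relations each. The real Microplanner is considerably more sophisticated; the size cap and encodings here are invented.

```python
MAX_CHUNK = 3  # relations per sentence-sized chunk (illustrative cap)

def chunk_relations(relations):
    """Split relation triples into chunks linked by shared arguments."""
    chunks = []
    for rel in relations:
        args = set(rel[1:])
        for chunk in chunks:
            shares_arg = any(args & set(r[1:]) for r in chunk)
            if shares_arg and len(chunk) < MAX_CHUNK:
                chunk.append(rel)
                break
        else:                       # no existing chunk fits: start a new one
            chunks.append([rel])
    return chunks

rels = [("_subj", "run", "John"), ("_advmod", "run", "fast"),
        ("_subj", "sleep", "Mary")]
print(chunk_relations(rels))
```

Here the two relations about "run" land in one sentence-sized chunk and the "sleep" relation starts another, which is the basic behavior microplanning needs before pronoun insertion and ordering are layered on.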

! basic structure is done, still more work is needed ... (Nov 2014)

Tentatively assigned to: William
Current time estimate: ...
Tentatively scheduled for: maybe finished Dec/Jan?

Task 8: Implement PLN-based word selection

Sometimes the Atomspace doesn’t transparently contain any word-based representation of an Atom chosen for articulation. Then we need to do some inference to figure out how to express the Atom in words. This can be done via customizing the PLN backward chainer.

Tentatively assigned to: William
Current time estimate: 2.5 months 
Tentatively scheduled for: later

Task 9: Design, implement conversational goals

The OpenCog-based dialogue system will be goal-driven: statements will be made based on the execution of “speech act procedures” that are selected via the OpenPsi motivation system based on its estimation that such execution will help achieve system goals.

But what system goals?

In principle a few very high-level goals (e.g. seek information, seek novelty, please others) should be enough for an AI system, and other useful goals can be derived as subgoals of these. For the immediate term, however, it will be useful to explicitly encode lower-level goals.

Example lower-level goals appropriate for general dialogue are, e.g. (not a complete list):

  • When engaged in a dialogue, fill up silence
  • Answer when asked
  • Utter important information
  • Get approval from one’s dialogue partner
  • Gain new information/knowledge

Additional specialized goals may be appropriate for implementation of dialogue in a game-character or robotics context.

Tentatively assigned to:  Rodas
Current time estimate: 2 months 
Tentatively scheduled for: Feb-March

Task 10: Implement fuzzy subhypergraph matching

Since the same (or very similar) information may be expressed in natural language in many different ways, question-answering must be able to handle the case where an NL query entered by a dialogue partner, and the commonsensical answer to this query as "read" by OpenCog from some text source, have different syntactic expressions.

Ideally this would be dealt with via PLN or other reasoning that fully "understands" the content of the query and answer sentences. However, as a temporary measure – and as a long-term mechanism for handling cases where the language involved is not fully understood – it will be useful to have a tool for query/answer matching based on "fuzzy Atom-subhypergraph matching." Such a tool will also be useful more broadly for other applications requiring fuzzy matching of Atomspace hypergraphs.

A tool of this nature was implemented in the Novamente Cognition Engine in 2004-2005, using a Dynamic Programming based algorithm, but this is not necessarily the ideal algorithm to use going forward.

So what we want is a similarity function of the conceptual form

float matchScore(AtomSubgraph A, AtomSubgraph B)

that is reasonably fast for Atom subgraphs with dozens of Atoms, plus a reasonably efficient algorithm that, given A, searches the Atomspace and finds subgraphs B that are close matches to A. The process has to be fast enough for real-time use within natural language dialogue, i.e. less than half a second or so.

An outline of one possible algorithm for this is given in: Approximate Pattern Matching
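
As an invented stand-in for the conceptual function above (not the algorithm from the Approximate Pattern Matching page), one could score the overlap between two subgraphs' link-sets with the Jaccard index and scan candidates for the best scorers:

```python
def match_score(a, b):
    """Jaccard similarity between two subgraphs, given as sets of links."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def find_approximate_matches(query, candidates, threshold=0.5):
    """Scan candidate subgraphs; return those above threshold, best first."""
    hits = [(match_score(query, c), c) for c in candidates]
    hits = [h for h in hits if h[0] >= threshold]
    hits.sort(key=lambda h: h[0], reverse=True)
    return hits

q = [("S", "dog", "run")]
cands = [[("S", "dog", "run"), ("O", "run", "home")],
         [("S", "cat", "sleep")]]
print(find_approximate_matches(q, cands))
```

A linear scan like this is too slow for a large Atomspace; the real algorithm would need indexing or heuristics to restrict the candidate set, which is where the Dynamic Programming (or other) approach comes in.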

Current time estimate: 2 months
Tentatively scheduled for: Dec-Jan
Tentatively assigned to: Man Hin

Task 11: Design, implement speech act and conversation-management schema (and link to goals)

In the initial version of the OpenCog dialogue system, utterances will be triggered by specific “speech act schema” that are hard-coded (via ImplicationLinks) to specific Goal Atoms. (The truth values of these links may then be adapted based on the system’s experience.)

Simple examples of such schema are:

  • Identify when a question has been asked, and answer it
  • Articulate the most important Atoms in the Atomspace
  • Argue/question, when the dialogue partner says something that seems incorrect

A longer list of such schema can be obtained from the speech act literature, especially from the paper by Twitchell and Nunamaker, "Speech Act Profiling: A Probabilistic Method for Analyzing Persistent Conversations and Their Participants", which presents SWBD-DAMSL (DAMSL = Dialog Act Markup in Several Layers), a fine-grained ontology of 42 kinds of speech acts. We don't need to implement all 42 of these, and we can add some others, but this paper gives a useful practical guide, as it is based on analyzing actual human dialogues in detail using speech act theory as a basis.

Examples of conversation-management schema would be: starting a conversation, saying "Uh huh", etc.
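
A hedged sketch of the goal-to-schema wiring: each (goal, schema) pair carries a strength, standing in for the truth value of the hard-coded ImplicationLink, and the most strongly implied schema for the currently active goal fires. Goal names, schema names, and weights are all illustrative.

```python
# (goal, schema) -> strength, playing the role of ImplicationLink truth values
IMPLICATIONS = {
    ("AnswerWhenAsked", "answer_question"): 0.9,
    ("FillSilence", "say_something_amusing"): 0.6,
    ("FillSilence", "articulate_important_atoms"): 0.4,
}

def select_schema(active_goal):
    """Pick the speech act schema most strongly implied by the active goal."""
    options = [(strength, schema)
               for (goal, schema), strength in IMPLICATIONS.items()
               if goal == active_goal]
    return max(options)[1] if options else None

print(select_schema("FillSilence"))
```

Adapting the strengths from experience, as the task description notes, would amount to updating these truth values after observing whether firing a schema actually served the goal.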

Tentatively assigned to: Rodas, Man Hin
Current time estimate: many months 
Tentatively scheduled for: March - ??

Task 12: Implement useful question answering

Given a basic question-answering schema, and a process for fuzzy hypergraph matching, these two can be connected, creating a question-answering schema that utilizes fuzzy hypergraph matching to answer questions. In principle this should yield a powerful QA system (if the Atomspace is filled with appropriate knowledge ingested via the comprehension pipeline), but it will doubtless require lots of adjustment, tweaking and fixing before it works OK.

Dependencies: 10, (part of) 11
Current time estimate: 1.5 months
Tentatively scheduled for: March - ??
Tentatively assigned to: Man Hin

Task 13: Import of OpenDial rules into the OpenCog-based system

The OpenDial chatbot, utilizing a combination of hand-coded rules and calls to external systems, should by September display a reasonably powerful chat functionality. This functionality can be ported into the OpenCog dialogue system, giving the OpenCog dialogue system the ability to conduct reasonably interesting conversations.

A script can be written to translate OpenDial rules (which are expressed as implications) into OpenCog ImplicationLinks; and it should be straightforward to translate OpenDial references to external sources (like MegaHal and OpenEphyra) into GroundedSchemaNodes referenceable within OpenCog ImplicationLinks. However, some thinking will need to be done regarding how to break up the OpenDial functionality among speech act schema. Some of the OpenDial functionality may go into new speech act schema (e.g. a "say something amusing" schema wrapping up the MegaHal reference), and some may be wrapped into existing speech act schema (e.g. the OpenEphyra reference could be invoked by the general question-answering schema, to be utilized when no sufficiently good answer to the question at hand is found within the Atomspace directly). The AIML-type rules used within OpenDial may end up being divided among various existing speech act schema (e.g. OpenDial rules for greetings could be placed within a general greetings schema).

OpenDial handling of dialogue variables (e.g. the user’s name) will have to be mapped into OpenCog structures carefully and probably on a case-by-case basis.
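
A sketch of the proposed translation script, under invented encodings: OpenDial rules are if-then implications over dialogue variables, so each can be rewritten as an ImplicationLink-like structure, with external calls becoming GroundedSchemaNode references. Real OpenDial rules are XML and the real target is AtomSpace ImplicationLinks; the dict format and tuple output here are for illustration only.

```python
def translate_rule(rule):
    """Map a {condition, action} rule dict to an implication tuple."""
    condition = ("EvaluationLink",
                 ("PredicateNode", rule["if_var"]),
                 ("ConceptNode", rule["equals"]))
    if rule.get("external"):          # e.g. a MegaHal or OpenEphyra call
        action = ("ExecutionOutputLink",
                  ("GroundedSchemaNode", rule["then"]))
    else:                             # plain response content
        action = ("ConceptNode", rule["then"])
    return ("ImplicationLink", condition, action)

r = {"if_var": "user_act", "equals": "Greeting", "then": "say_hello"}
print(translate_rule(r)[0])
```

The case-by-case work flagged above for dialogue variables like the user's name would show up here as extra condition forms that this simple equality test cannot express.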

Tentatively assigned to: Man Hin 
Current time estimate: 1 month
Tentatively scheduled for: sometime in early 2015, when it seems useful

Task 14: Testing of generation from general subhypergraph to English

Putting the {sentence-sized chunk of Atoms ---> English sentence} functionality together with word selection and microplanning, one obtains a language generation functionality capable of mapping a general Atom-set into a series of related English sentences.

This task consists of making a set of test cases for this overall functionality, and testing the system accordingly, and fixing the obvious things that are found to be broken.

Dependencies: 5,6,7
Tentatively assigned to: Amen, Rodas, Alex (and possibly some new interns)
Current time estimate: 2 months
Tentatively scheduled for: Sep-Oct 

Task 15: Testing of dialogue system

Putting the “generation from general subhypergraph to English” functionality together with the system of speech acts and associated goals, plus the content imported from the OpenDial chatbot, we obtain an overall OpenCog dialogue system.

This task consists of making a set of test cases for this overall functionality, and testing the system accordingly, and fixing the obvious things that are found to be broken.

Dependencies: 8,11,12,13,14
Tentatively assigned to: Amen, Rodas, Alex
Current time estimate: 2 months 
Tentatively scheduled for: Nov-Dec

Task 16: Integration of dialogue system w/ robot

By the time this task is undertaken, an OpenDial-based chat system will already be integrated with a robot within an OpenCog-based robotics framework. Substituting the OpenCog-based dialogue system in place of the OpenDial-based system should not be extremely difficult. Tweaking the various speech act schema and associated hard-coded dialogue rules for appropriateness in a robotics context may need some effort.

Subtle issues here would seem to be:

  • Goals specific to the robotics context, and their connection with speech acts
  • Articulation of descriptions of things in the robot's physical environment
  • Articulation of events, processes and sequences in the robot's environment
Dependencies: 13
Tentatively assigned to: Mandeep & Man Hin with help from Misgana
Current time estimate: 3 months
Tentatively scheduled for: Oct-Dec

Task 17: Integration w/ game character control

By the time this task is undertaken, an OpenDial-based chat system will already be integrated with a video game agent within the Embodiment framework connected to Unity3D. Substituting the OpenCog-based dialogue system in place of the OpenDial-based system should not be extremely difficult. Tweaking the various speech act schema and associated hard-coded dialogue rules for appropriateness in the game context may need some effort.

Subtle issues here would seem to be:

  • Goals specific to the game world context, and their connection with speech acts
  • Articulation of descriptions of things in the game world environment
  • Articulation of events, processes and sequences in the game world
Dependencies:13
Tentatively assigned to: Lake
Current time estimate: 3 months
Tentatively scheduled for: Oct-Dec

Task 18: Testing robotic dialogue

Making a set of test cases for robotic dialogue, doing testing, and fixing what’s broken.

Dependencies:14
Tentatively assigned to: Mandeep
Current time estimate: 2 months part time
Tentatively scheduled for: Jan-Feb

Task 19: Testing game character dialogue

Making a set of test cases for game character dialogue, doing testing, and fixing what’s broken.

Dependencies: 15
Tentatively assigned to: Lake
Current time estimate: 2 months part time
Tentatively scheduled for: Jan-Feb

Task 20: Experimentation with Unsupervised Learning of Link Grammar Rules

Initial experimentation with learning link grammar link types and dictionary entries from an unlabeled corpus, according to the general ideas outlined in http://arxiv.org/abs/1401.3372

Experimentation should include learning "from scratch", and may also include learning based on taking parts of the existing link grammar dictionary as an initial condition.

Results of this initial experimentation will determine next steps for unsupervised learning in the OpenCog NLP pipeline (potentially involving unsupervised learning of RelEx and RelEx2Logic rules, for example).
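
The first statistical step in the referenced approach can be illustrated with pointwise mutual information over word-pair co-occurrence counts, from which link types would subsequently be induced. This toy computes PMI for adjacent pairs only; corpus, window size, and (absent) smoothing are placeholder choices.

```python
import math
from collections import Counter

def pmi_table(sentences):
    """PMI for adjacent word pairs: log of p(a,b) / (p(a) * p(b))."""
    words, pairs = Counter(), Counter()
    for s in sentences:
        toks = s.split()
        words.update(toks)
        pairs.update(zip(toks, toks[1:]))    # adjacent pairs only
    n_w, n_p = sum(words.values()), sum(pairs.values())
    return {p: math.log((c / n_p) /
                        ((words[p[0]] / n_w) * (words[p[1]] / n_w)))
            for p, c in pairs.items()}

corpus = ["the cat sat", "the dog sat", "a cat ran"]
table = pmi_table(corpus)
print(sorted(table, key=table.get, reverse=True)[:3])
```

High-PMI pairs are candidates for being joined by a learned link type; clustering words by their PMI profiles is then one route toward learned dictionary entries.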

Dependencies: none
Tentatively assigned to: Aaron & Matthew, with support from Ruiting & Amen
Current time estimate: 5 months
Tentatively scheduled for: Jan-May

Task 21: Getting some backing store working effectively

This is not really an NLP task, but is listed here due to its importance for Task 20 (unsupervised learning).

For unsupervised learning it will be very valuable to be able to rapidly load (small or large) pieces of a saved AtomSpace according to a flexible set of queries.

This could be done by extending the Postgres Backing Store's API, or by integrating some other backing store and writing an appropriate API for it and customizing it appropriately.
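
A sketch of the kind of query-driven partial loading this task asks for, simulated with an in-memory SQLite table standing in for the Postgres backing store. The schema and query interface are invented; the point is loading only the Atoms matching a flexible predicate rather than the whole saved AtomSpace.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atoms (type TEXT, name TEXT, sti REAL)")
conn.executemany("INSERT INTO atoms VALUES (?, ?, ?)",
                 [("WordNode", "cat", 0.9),       # toy saved-AtomSpace rows
                  ("WordNode", "dog", 0.2),
                  ("ConceptNode", "animal", 0.8)])

def load_atoms(where_clause, params):
    """Load only the Atoms matching the query, not the whole store."""
    cur = conn.execute("SELECT type, name FROM atoms WHERE " + where_clause,
                       params)
    return cur.fetchall()

# e.g. fetch only high-importance WordNodes for an unsupervised-learning run
print(load_atoms("type = ? AND sti > ?", ("WordNode", 0.5)))
```

Whatever backing store is chosen, the API question is the same: which predicates (type, name pattern, importance, incoming-set membership) the query layer should support natively.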

Dependencies: none
Tentatively assigned to: Matthew, with support from Amen and Linas
Current time estimate: 4 months part time
Tentatively scheduled for: Jan-April

Task 22: Create OK model of discourse context

What is the (implicit) context that a discourse is relevant to? This includes the conversation history, plus stuff in the immediate environment being talked about (e.g. the common "physical" environment, in a game world or robotics context), plus "obviously" relevant stuff pulled up from memory. (Ultimately this requires theory-of-mind modeling, but simple cases probably don't need this.)

Information-theoretic criteria of surprisingness may be applied relative to the discourse context, to aid in word/phrase selection and other tasks.
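
One concrete reading of that criterion is surprisal, -log2 p(item | context), estimated from counts of what has appeared in the discourse context so far; items with higher surprisal are better candidates for articulation. The smoothing scheme and the use of raw context counts are illustrative choices.

```python
import math
from collections import Counter

def surprisal(item, context_counts, alpha=1.0):
    """-log2 of the (add-alpha smoothed) probability of `item` in context."""
    total = sum(context_counts.values())
    vocab = len(context_counts) + 1          # +1 slot for unseen items
    p = (context_counts.get(item, 0) + alpha) / (total + alpha * vocab)
    return -math.log2(p)

ctx = Counter({"robot": 5, "hello": 3})      # toy discourse-context counts
print(surprisal("spaceship", ctx), surprisal("robot", ctx))
```

A concept never yet mentioned in the context scores higher than a much-repeated one, which is the behavior wanted for biasing word/phrase selection toward informative utterances.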

Dependencies: none
Tentatively assigned to: William? Man Hin?  ??
Current time estimate: 2 months, 3 days, 7 hours, 1 minute, 4 seconds
Tentatively scheduled for: later in 2015