NLP-PLN-NLGen pipeline


Obsolete Documentation. This page describes ideas and a plan for a subsystem that is no longer applicable to the current code base and the current design plans. It is being kept here temporarily, just in case there's some gold in it... After a review, this page should probably be deleted.

As the Brazilians are working on integrating NLP with PetBrain, and Linas is actively working on NLP, my job is to get PLN doing something with NLP data. Then, once some inference has been carried out, possibly interacting with background knowledge, the conclusions should be expressed via NLGen.

This essentially implements the idea described at Ideas#PLN_Inference_on_extracted_semantic_relationships.

The RelEx web tool might be of use while working out the output produced for a particular sentence parse.

Concrete tasks

  • Construct a test scenario, including background knowledge and the expected dialogue between the user and OpenCog. (1 week)
  • Set up the NLP pipeline, including the recent chatbot created by Linas and relevant dependencies. (0.5 weeks)
  • Create an RFS Manager that will manage requests for certain relationships (a toy sketch of this and the back-chaining step follows this list). (1 week)
  • Get the PLN BackChainingAgent to fulfil these requests. (2 weeks)
  • Convert the PLN BIT to also carry out forward chaining, so that PLN can draw conclusions from NLP input without the user prompting it to find certain relationships. (3 weeks)
  • Use NLGen to convert fulfilment of an RFS into text output. (2 weeks)
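
The sketch below is purely hypothetical (plain Python rather than actual OpenCog agent code) and only illustrates the intended division of labour: an RFS Manager queues requests for certain relationships, and a back-chaining agent tries to fulfil them. All class and function names here are invented for illustration.

from collections import deque
from dataclasses import dataclass

@dataclass(frozen=True)
class RelationshipRequest:
    """A request for a relationship, e.g. ('Inheritance', 'Socrates', 'mortal')."""
    link_type: str
    source: str
    target: str

class RFSManager:
    """Queues requests for relationships that inference should try to establish."""
    def __init__(self):
        self.pending = deque()

    def request(self, req):
        self.pending.append(req)

    def next_request(self):
        return self.pending.popleft() if self.pending else None

class ToyBackChainingAgent:
    """Stand-in for the PLN BackChainingAgent: answers a request either directly
    from known links or by a single deduction step over Inheritance links."""
    def __init__(self, known_links):
        self.known = set(known_links)  # tuples like ('Inheritance', 'Socrates', 'man')

    def fulfil(self, req):
        goal = (req.link_type, req.source, req.target)
        if goal in self.known:
            return goal
        # One backward deduction step: A->B and B->C entail A->C.
        for (lt, a, b) in self.known:
            if lt == req.link_type and a == req.source and (lt, b, req.target) in self.known:
                return goal
        return None

manager = RFSManager()
manager.request(RelationshipRequest("Inheritance", "Socrates", "mortal"))
agent = ToyBackChainingAgent({("Inheritance", "Socrates", "man"),
                              ("Inheritance", "man", "mortal")})
print(agent.fulfil(manager.next_request()))  # ('Inheritance', 'Socrates', 'mortal')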

Goals

While the above are concrete programming and implementation tasks (which are liable to become more detailed as the plan progresses), this section contains the stages and goals of the project.

Initial attempt

Just try to get PLN reasoning on simple syllogisms (don't worry about NLGen):

  1. Select a few example syllogisms from the next section, expressed in natural language, and shamelessly tweak the language so that RelEx and RelEx2Frame can handle them adequately (producing appropriate nodes and links as output).
  2. Add any extra background knowledge needed to make them work, on an ad hoc basis.
  3. Get PLN to do the syllogistic inference.

In the bigger picture, the common-sense background knowledge will have to come mainly from embodied experience and/or the system's parsing/interpretation of large volumes of text. This will require effective integration of attention allocation, and a lot of other things. But right now we just need to make sure all the machinery is working in an integrated way...

Example Syllogisms

Degree of difficulty: Some kinds of reasoning are harder than others. It is suggested that the following should be attempted, roughly in order of increasing difficulty:

  • is-a, has-a relations
  • part-of relations
  • next-to, inside-of, etc. spatial/location relations
  • at, during, before, after, etc. time/event relations
  • combinations of above
  • went-to, moved-to etc. physical movement relations
  • action, doing relations (talking, hitting, pushing, holding, mixing, etc)

This hierarchy requires increasingly sophisticated knowledge about the world: from simple is-a and has-a relations, to concepts of space and time, to "common-sense" knowledge (such as what it means to talk, hit, push, hold, or mix something).

All are of the form:

Premise 1
...
Premise n
|-
Conclusion

Each syllogism may need some background knowledge, which follows the example. Each syllogism also has its own page, which contains the output of the various NLP tools and the projected inference.

is-a/has-a Relations

The classic Mortal Socrates:

Socrates is a man
Men are mortal
|-
Socrates is mortal
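
As a rough illustration of what the inference step amounts to for this syllogism, here is a toy Python sketch, not PLN itself: the two premises are treated as Inheritance relations with simple strength values, combined by an independence-based deduction formula of the kind PLN uses. Whether this exact formula matches the current PLN code is not checked here, and all the numbers (including the term strengths) are made up.

def deduction_strength(s_ab, s_bc, s_b, s_c):
    """Strength of A->C given A->B, B->C and term strengths for B and C."""
    if s_b >= 1.0:
        return s_c
    return s_ab * s_bc + (1.0 - s_ab) * (s_c - s_b * s_bc) / (1.0 - s_b)

# Premises: Inheritance Socrates man, Inheritance man mortal
s_socrates_man = 0.99
s_man_mortal = 0.95
s_man, s_mortal = 0.01, 0.05  # made-up term strengths

s_socrates_mortal = deduction_strength(s_socrates_man, s_man_mortal, s_man, s_mortal)
print("Inheritance Socrates mortal, strength ~", round(s_socrates_mortal, 3))  # ~0.941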

Hyperventilating Socrates:

Socrates is a man
Men breathe air
|-
Socrates breathes air

Bob's two feet:

Bob is a human.
Humans have two feet.
|-
Bob has two feet.

part-of relations

Axles and Unicycles:

Unicycles have one wheel.
An axle is part of a wheel.
|-
An axle is part of a unicycle.

Spatial relations

Temporal relations

Spatial and temporal relations

Bob visits Jack Black:

On Tuesday, Bob went to Jack Black's house
Jack Black's house is in Topeka
|-
On Tuesday, Bob was in Topeka

Background knowledge: if X goes to place Y, then after that, X is in place Y.
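
A hedged sketch (plain Python tuples rather than AtomSpace links, with all names invented) of how forward chaining could apply that background rule: together with "Jack Black's house is in Topeka", it yields "On Tuesday, Bob was in Topeka".

facts = {
    ("went_to", "Bob", "Jack_Blacks_house", "Tuesday"),  # from the parsed sentence
    ("in", "Jack_Blacks_house", "Topeka"),               # from the parsed sentence
}

def forward_chain(facts):
    """Apply the two hand-written rules below until no new facts appear."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        new = set()
        for f in derived:
            # Rule 1: if X goes to place Y at time T, then (just after) X is at Y at time T.
            if f[0] == "went_to":
                new.add(("at", f[1], f[2], f[3]))
            # Rule 2: being at a place that is in a region means being in that region.
            if f[0] == "at":
                for g in derived:
                    if g[0] == "in" and g[1] == f[2]:
                        new.add(("at", f[1], g[2], f[3]))
        if not new <= derived:
            derived |= new
            changed = True
    return derived

print(("at", "Bob", "Topeka", "Tuesday") in forward_chain(facts))  # True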


Multiple observation channels inference example:

Examples where:

  • one premise comes from observation of the world
  • one premise comes from linguistic input

e.g.

Evaluation near (me, house)

comes from observation (the PetBrain gets this sort of stuff now, though I'm not sure what format it stores it in), and

Inheritance house building

comes from NLP processing of someone telling the system "Houses are buildings"... and then we ask the system

"Are you near a building?"

and it says

"Yes"

This kind of example is great to demo and talk about, because it shows use of reasoning to glue together language-gained knowledge and embodiment-gained knowledge...
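
A minimal sketch of that inference, assuming nothing about the PetBrain's actual storage format (ordinary Python sets stand in for the AtomSpace, and the query function is invented): the observation-derived Evaluation near (me, house) and the language-derived Inheritance house building are glued together by a single inheritance lookup to answer "Are you near a building?".

observations = {("near", "me", "house")}   # Evaluation near (me, house), from embodiment
inheritance = {("house", "building")}      # Inheritance house building, from language

def near_a(kind):
    """Is 'me' near anything that is (or directly inherits from) `kind`?"""
    for (_, who, what) in observations:
        if who == "me" and (what == kind or (what, kind) in inheritance):
            return True
    return False

print("Yes" if near_a("building") else "No")  # -> Yes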

Make inference more robust

The initial attempt allows the syllogisms to be massaged into something that gives the results we want. The next step is to explore the robustness of the NLP pipeline and of inference on its output, and to ask: is it possible to address differences in the pipeline output through alternative inference paths?

Turn inference results into natural language

The next step is to utilise NLGen to turn the conclusions from the syllogisms into natural language.

What this really involves is translating inference results into RelEx-style relationships; NLGen will supposedly do the rest.
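
Exactly what those RelEx-style relationships should look like is left open here; the sketch below is only a guess at the shape of such a converter, turning an Inheritance conclusion into predicate-argument strings of the general _subj(...)/_obj(...) flavour that RelEx produces. The particular relations chosen for a copula sentence are an assumption, not verified NLGen input.

def inheritance_to_relex(source, target):
    """Render 'Inheritance source target' as RelEx-style relation strings (assumed form)."""
    return ["_subj(be, %s)" % source, "_obj(be, %s)" % target]

print(inheritance_to_relex("Socrates", "mortal"))
# ['_subj(be, Socrates)', '_obj(be, mortal)']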

Queries by natural language

Finally, we want to allow the user talking to OpenCog (through the chatbot interface) to ask questions and have OpenCog attempt to answer them. This differs from the above step, which is just forward chaining on the output of NLP.

(Note: since there is no forward chainer in PLN yet, prompting OpenCog for relationships may be easier initially; therefore, this step may be swapped with the first.)