Reference resolution (2009 Embodiment)


This page is an archived historical copy of reference resolution in the OpenCog Game-World Embodiment system, as implemented in 2009 and used in various virtual-world demos from 2009 through 2013. The code described here was removed from GitHub in 2015, as part of a large codebase cleanup effort.

Reference Resolution (Embodiment)

See also: Reference resolution, Anaphora resolution (Embodiment)

Introduction

The environment (the world in which the agent's body is placed) consists of multiple objects. Each object belongs to a class, which determines what kind of entity the object is (ball, stick, bone, barrel, etc.). For instance, the world may contain several balls, each with its own properties. So, if the avatar says "ball" in a sentence, which real ball (one physically present in the environment) did he or she mean? The reference resolution process answers this question.

When Relex parses a sentence, its output (atoms in OpenCog Scheme format) is sent by the proxy to the OPC and arrives at PAI. After PAI loads the incoming data into the AtomTable, it starts the reference resolution process, which identifies the nodes (SemeNodes), if any, that each word in the sentence (mainly nouns and pronouns) refers to. So, if the avatar says "ball", this process tries to determine whether the avatar meant ball_99 or ball_11, based on their properties and the complete parsed sentence.

Scenario

Suppose that there is a red ball in the environment, identified by the id id_ball99. The agent's AtomTable will then contain the following nodes:

AccessoryNode "id_ball99"
SemeNode "id_ball99"
ConceptNode "red"

ReferenceLink
   AccessoryNode "id_ball99"
   SemeNode "id_ball99"

ReferenceLink
   SemeNode "id_ball99"
   WordNode "ball"

Now, suppose the owner has said the sentence "grab the red ball". A set of atoms representing the parsed sentence will be loaded into the AtomTable. Here are some WordInstanceNodes and their linguistic concepts, extracted from the Relex2Atoms output loaded into the agent's AtomTable:

WordInstanceNode "grab@a6460c2d-b5f8-4287-8882-028d12de42d2"
WordInstanceNode "the@5bc9cde1-36b2-4bcb-97e8-da323175e4f9"
WordInstanceNode "red@216e8536-4867-49bc-970a-fc69608e39d2"
WordInstanceNode "ball@2227209d-89f8-444f-ab90-d749bbc71eda"

PartOfSpeechLink
   WordInstanceNode "grab@a6460c2d-b5f8-4287-8882-028d12de42d2"
   DefinedLinguisticConceptNode "verb"

PartOfSpeechLink
   WordInstanceNode "the@5bc9cde1-36b2-4bcb-97e8-da323175e4f9"
   DefinedLinguisticConceptNode "det"

PartOfSpeechLink
   WordInstanceNode "ball@2227209d-89f8-444f-ab90-d749bbc71eda"
   DefinedLinguisticConceptNode "noun"

PartOfSpeechLink
   WordInstanceNode "red@216e8536-4867-49bc-970a-fc69608e39d2"
   DefinedLinguisticConceptNode "adj"

ReferenceLink
   WordInstanceNode "red@216e8536-4867-49bc-970a-fc69608e39d2"
   WordNode "red"

ReferenceLink
   WordInstanceNode "ball@2227209d-89f8-444f-ab90-d749bbc71eda"
   WordNode "ball"

Reference resolution is the process of finding a "ground" element for a term mentioned in a given sentence. A ground element, as mentioned before, is a real element present in the environment. With that in mind, the sentence terms that matter at this point are nouns and pronouns. So let's see how to identify real things by using nouns and pronouns.

Resolving nouns

Each noun is mapped to its corresponding WordNode. In general, a WordNode represents a class of objects. The WordNode "ball", for example, represents every ball that can be mentioned in any sentence. On the environment side, a SemeNode is used to uniquely identify a real object, as shown in the first listing. A SemeNode has the same name as the real object's node. Each object represented by a SemeNode belongs to a class of objects, and that class is the one referred to by a WordNode. So, the connection between the spoken structures and the real world is a ReferenceLink between SemeNodes and WordNodes.
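
The lookup described above can be sketched in Python. This is only an illustration: atoms are modelled as plain tuples rather than real AtomSpace objects, the names reference_links and candidate_semes are hypothetical, the UUID suffix of the word instance is abbreviated, and the objects id_ball11 and id_stick7 are assumed for the sake of the example.

```python
# Hypothetical, simplified model: each ReferenceLink is a (source, target)
# pair of (node_type, name) tuples. This is not the OpenCog API.
reference_links = [
    (("SemeNode", "id_ball99"), ("WordNode", "ball")),
    (("SemeNode", "id_ball11"), ("WordNode", "ball")),   # assumed second ball
    (("SemeNode", "id_stick7"), ("WordNode", "stick")),  # assumed extra object
    (("WordInstanceNode", "ball@2227209d"), ("WordNode", "ball")),
]

def candidate_semes(word_instance, links):
    """Return the names of SemeNodes that share a WordNode with the word instance."""
    # Step 1: follow the ReferenceLinks from the word instance to its WordNode(s).
    words = {dst for src, dst in links
             if src == ("WordInstanceNode", word_instance)}
    # Step 2: collect every SemeNode that references one of those WordNodes.
    return sorted(src[1] for src, dst in links
                  if src[0] == "SemeNode" and dst in words)

print(candidate_semes("ball@2227209d", reference_links))
# prints ['id_ball11', 'id_ball99']
```

Both balls remain candidates at this stage; the rules described below are what narrow the set down to a single SemeNode.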

Each real object has a list of properties, such as color, texture, etc. Each property is defined in the AtomTable by predicates like:

EvaluationLink
   PredicateNode "color"
   ListLink
      AccessoryNode "id_ball99"
      ConceptNode "red"

These kinds of properties are then used by the Reference Resolution rules to filter and select the best candidate to ground the WordInstanceNode. When a property is mentioned in a sentence, as in "red ball", "small bear", etc., Relex converts it into Frame instances and adds them to the final parsed sentence output. So, if the sentence is "Grab the red ball", the Color frame representing the word "red" is used to map "ball" to the correct SemeNode, which is the one that has the red color property associated with it.

A Reference Resolution Rule is defined by an ImplicationLink, which contains preconditions and an effect. The preconditions essentially check for the presence of specific Frame instances in the latest parsed sentence. If the preconditions of an ImplicationLink are satisfied, it must return a list of SemeNodes whose characteristics match that rule. Let's look at a real example:

BindLink
   ListLink
      TypedVariableLink
         VariableNode "$entityValue"
         VariableTypeNode "WordInstanceNode"        
      TypedVariableLink
         VariableNode "$colorValue"
         VariableTypeNode "WordInstanceNode"         
      TypedVariableLink
         VariableNode "$entityWordNode"
         VariableTypeNode "WordNode"         
      TypedVariableLink
         VariableNode "$colorWordNode"
         VariableTypeNode "WordNode"        
      TypedVariableLink
         VariableNode "$framePredicateNode"
         VariableTypeNode "PredicateNode"         
      TypedVariableLink
         VariableNode "$frameEntityPredicateNode"
         VariableTypeNode "PredicateNode"         
      TypedVariableLink
         VariableNode "$frameColorPredicateNode"
         VariableTypeNode "PredicateNode"
               
   AndLink
      InheritanceLink
         VariableNode "$framePredicateNode"
         DefinedFrameNode "#Color"            
      InheritanceLink
         VariableNode "$frameEntityPredicateNode"
         DefinedFrameElementNode "#Color:Entity"            
      InheritanceLink
         VariableNode "$frameColorPredicateNode"
         DefinedFrameElementNode "#Color:Color"            

      FrameElementLink
         VariableNode "$framePredicateNode"
         VariableNode "$frameEntityPredicateNode"            
      FrameElementLink
         VariableNode "$framePredicateNode"
         VariableNode "$frameColorPredicateNode"
            
      EvaluationLink
         VariableNode "$frameEntityPredicateNode"
         VariableNode "$entityValue"            
      EvaluationLink
         VariableNode "$frameColorPredicateNode"
         VariableNode "$colorValue"            

      ReferenceLink
         VariableNode "$entityValue"
         VariableNode "$entityWordNode"            
      ReferenceLink
         VariableNode "$colorValue"
         VariableNode "$colorWordNode"
            
      EvaluationLink
         GroundedPredicateNode "scm:inLatestSentence"
         ListLink
            VariableNode "$framePredicateNode"

   ExecutionLink
      GroundedSchemaNode "scm:filterByColor"
      ListLink
         VariableNode "$entityValue"
         VariableNode "$entityWordNode"
         VariableNode "$colorWordNode"

The rule represented by the ImplicationLink above is responsible for selecting every SemeNode whose color property matches the color mentioned in the latest parsed sentence.

The sentence "grab the red ball" will produce in the Relex output a Frame instance like this:

InheritanceLink
   PredicateNode "red@216e8536-4867-49bc-970a-fc69608e39d2_Color"
   DefinedFrameNode "#Color"
InheritanceLink
   PredicateNode "red@216e8536-4867-49bc-970a-fc69608e39d2_Color_Color"
   DefinedFrameElementNode "#Color:Color"
InheritanceLink
   PredicateNode "red@216e8536-4867-49bc-970a-fc69608e39d2_Color_Entity"
   DefinedFrameElementNode "#Color:Entity"

EvaluationLink
   PredicateNode "red@216e8536-4867-49bc-970a-fc69608e39d2_Color_Color"
   WordInstanceNode "red@216e8536-4867-49bc-970a-fc69608e39d2"
EvaluationLink
   PredicateNode "red@216e8536-4867-49bc-970a-fc69608e39d2_Color_Entity"
   WordInstanceNode "ball@2227209d-89f8-444f-ab90-d749bbc71eda"

FrameElementLink
   PredicateNode "red@216e8536-4867-49bc-970a-fc69608e39d2_Color"
   PredicateNode "red@216e8536-4867-49bc-970a-fc69608e39d2_Color_Entity"
FrameElementLink
   PredicateNode "red@216e8536-4867-49bc-970a-fc69608e39d2_Color"
   PredicateNode "red@216e8536-4867-49bc-970a-fc69608e39d2_Color_Color"

So the ImplicationLink, which detects colors, will look for structures like the one above in the AtomTable. You have probably noticed that the effect of the rule is an ExecutionLink. When executed, it calls the function referred to by the GroundedSchemaNode (i.e., "scm:filterByColor" names a procedure filterByColor written in Scheme), which receives as arguments the elements listed in the ListLink. The variables are grounded before the procedure is executed. The procedure must then return a list of chosen SemeNodes for each WordInstanceNode that matches the preconditions.
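
To make the role of the procedure concrete, here is a minimal Python sketch of what a filter such as filterByColor might do once its three arguments are grounded. It is not the original Scheme code: the dictionaries seme_to_word and color_of are hypothetical stand-ins for the ReferenceLinks and the "color" EvaluationLinks in the AtomTable, the blue ball id_ball11 is assumed, and the UUID suffix is abbreviated.

```python
# Hypothetical stand-in for the Scheme procedure behind "scm:filterByColor".
# Atoms are modelled as plain Python data, not as AtomSpace objects.

# SemeNode name -> WordNode name (the ReferenceLinks from the scenario)
seme_to_word = {"id_ball99": "ball", "id_ball11": "ball"}
# object name -> color (the EvaluationLinks on PredicateNode "color")
color_of = {"id_ball99": "red", "id_ball11": "blue"}  # id_ball11 is assumed

def filter_by_color(entity_value, entity_word, color_word):
    """Return [word instance, seme, ...] for semes of that class and color."""
    semes = [seme for seme, word in sorted(seme_to_word.items())
             if word == entity_word and color_of.get(seme) == color_word]
    return [entity_value] + semes

result = filter_by_color("ball@2227209d", "ball", "red")
print(result)
# prints ['ball@2227209d', 'id_ball99']
```

The arguments mirror the grounded variables $entityValue, $entityWordNode and $colorWordNode passed in the ListLink of the ExecutionLink.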

More than one rule can be used to filter the SemeNodes by the properties mentioned in the sentence. At the end of the resolution of a WordInstanceNode, exactly one SemeNode is used to ground the term, and a ReferenceLink connects them:

ReferenceLink
   SemeNode "id_ball99"
   WordInstanceNode "ball@2227209d-89f8-444f-ab90-d749bbc71eda"

Now the sentence is ready for other kinds of processing, such as Command Resolution, which converts a sentence into actions.
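
When several rules apply to the same WordInstanceNode, their results have to be combined somehow. The text does not say how this was done, so the Python sketch below simply assumes that the candidate sets are intersected. The function combine_rule_results, the size rule and all object ids other than id_ball99 are hypothetical.

```python
# Assumption: each rule yields a collection of SemeNode names for one word
# instance, and the results are combined by intersection. The source does
# not state the actual combination strategy.
def combine_rule_results(all_candidates, rule_results):
    """Keep only the candidates accepted by every rule."""
    remaining = set(all_candidates)
    for result in rule_results:
        remaining &= set(result)
    return sorted(remaining)

candidates = ["id_ball11", "id_ball99", "id_ball42"]
by_color = ["id_ball99", "id_ball42"]   # hypothetical output of a color rule
by_size = ["id_ball99", "id_ball11"]    # hypothetical output of a size rule
print(combine_rule_results(candidates, [by_color, by_size]))
# prints ['id_ball99']
```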

Resolving pronouns

If the sentence contains a pronoun and there is more than one anaphoric candidate for it, this module must identify which one fits best according to the anaphoric score and the properties of the objects involved. The command should also be used to help identify the correct noun. For instance, given the sentences "Go to the tree", "Go to the ball" and "Grab it", the anaphora algorithm may suggest both the tree and the ball as candidates, but it is impossible to "grab" the tree. So, the ball must be chosen as the preferred candidate.

The main difference between resolving nouns and pronouns is that, for pronouns, an additional step is executed before the SemeNodes are retrieved. It does not make sense to follow the WordNode of a pronoun to find SemeNodes, so to resolve a pronoun we use the anaphoric suggestions to fetch one or more nouns that the pronoun may refer to. Here is an example of anaphoric suggestions for the pronoun 'it':

EvaluationLink (stv 1 0.5)
   ConceptNode "anaphoric reference"
   ListLink
      WordInstanceNode "it@d605f2e6-98bd-47fa-b3e1-5b7b9dbc570a" 
      WordInstanceNode "bone@c7c2a7a5-1d0b-485e-b789-ed6cbf58b162" 
   
EvaluationLink (stv 1 0.3333333333333333)
   ConceptNode "anaphoric reference"
   ListLink
      WordInstanceNode "it@d605f2e6-98bd-47fa-b3e1-5b7b9dbc570a" 
      WordInstanceNode "bone@f059078c-6b11-4bfc-a837-f3b050f9830c" 
   
EvaluationLink (stv 1 0.25)
   ConceptNode "anaphoric reference"
   ListLink
      WordInstanceNode "it@d605f2e6-98bd-47fa-b3e1-5b7b9dbc570a" 
      WordInstanceNode "bear@dbf88055-d39d-4a5e-9077-672ee384330e"   

EvaluationLink (stv 1 0.2)
   ConceptNode "anaphoric reference"
   ListLink
      WordInstanceNode "it@d605f2e6-98bd-47fa-b3e1-5b7b9dbc570a" 
      WordInstanceNode "stick@13ba07b1-50b3-42fd-9c2f-374c9b979f10" 
  
EvaluationLink (stv 1 0.2)
   ConceptNode "anaphoric reference"
   ListLink
      WordInstanceNode "it@d605f2e6-98bd-47fa-b3e1-5b7b9dbc570a" 
      WordInstanceNode "ball@2227209d-89f8-444f-ab90-d749bbc71eda"

Note that (stv <mean> <confidence>) is the SimpleTruthValue of the suggestion and indicates the strength of that suggestion. This information is used to select the noun that will ground the pronoun.
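
As a rough illustration of how the truth values could be used to order the candidates, the Python sketch below ranks the suggestions from the listing above. The ordering (mean first, then confidence, both descending) is an assumption, rank_suggestions is a hypothetical name, and the UUID suffixes are abbreviated.

```python
# (candidate noun instance, stv mean, stv confidence), taken from the
# anaphoric suggestions listed above, with UUID suffixes abbreviated.
suggestions = [
    ("bone@c7c2a7a5", 1.0, 0.5),
    ("bone@f059078c", 1.0, 0.3333333333333333),
    ("bear@dbf88055", 1.0, 0.25),
    ("stick@13ba07b1", 1.0, 0.2),
    ("ball@2227209d", 1.0, 0.2),
]

def rank_suggestions(items):
    """Sort by mean, then confidence, both descending (an assumed ordering)."""
    return sorted(items, key=lambda s: (s[1], s[2]), reverse=True)

ranked = rank_suggestions(suggestions)
print(ranked[0][0])
# prints bone@c7c2a7a5
```

Ties such as the stick and the ball (both 0.2) are not resolved by this ranking alone and need a further criterion.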

Once each candidate noun for grounding the pronoun has been retrieved, the process is the same as for grounding a noun, except that there are more options to consider. For each WordInstanceNode representing a candidate noun, its corresponding SemeNodes are evaluated by the Rules, and at the end just one SemeNode grounds the pronoun.

It is important to mention that if more than one option still remains after the Rules filter step and the anaphoric strength filter step, a last filter is applied to choose just one. This filter simply compares the distance between each element and the agent. The nearest one is chosen.
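
The proximity filter can be sketched in Python as follows. Euclidean distance is an assumption (the text only says "proximity"), and the function name nearest, the coordinates and the id id_ball11 are hypothetical.

```python
import math

def nearest(agent_pos, candidates):
    """Return the name of the candidate closest to the agent.

    candidates maps a SemeNode name to an (x, y, z) position.
    Euclidean distance is assumed as the proximity measure.
    """
    return min(candidates,
               key=lambda name: math.dist(agent_pos, candidates[name]))

agent = (0.0, 0.0, 0.0)
remaining = {
    "id_ball99": (3.0, 4.0, 0.0),   # distance 5.0
    "id_ball11": (1.0, 1.0, 0.0),   # distance about 1.41
}
print(nearest(agent, remaining))
# prints id_ball11
```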

Adding New Reference Resolution Rules

Each ImplicationLink that describes a filter rule can be found in the Scheme file opencog/embodiment/scm/reference-resolution-rules.scm. To add a new one, write an ImplicationLink that, when evaluated, returns a list of WordInstanceNodes with their candidate SemeNodes, like:

 [
   [ WordInstanceNode "win1", SemeNode "1a",..., SemeNode "1n"]
   .
   .
   .
   [ WordInstanceNode "win2", SemeNode "2a",..., SemeNode "2n"]
 ]

This can be done by defining an ExecutionLink around a GroundedSchemaNode that calls a procedure which builds that list.
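
The expected shape of the result can be illustrated with a short Python sketch. It only shows the nesting of the returned list; the real procedure would be written in Scheme and would return atoms rather than strings. The name group_candidates is hypothetical.

```python
def group_candidates(pairs):
    """Group (word_instance, seme) pairs into [[win, seme, ...], ...]."""
    grouped = {}
    for win, seme in pairs:
        # One entry per word instance, in order of first appearance.
        grouped.setdefault(win, []).append(seme)
    return [[win] + semes for win, semes in grouped.items()]

pairs = [("win1", "1a"), ("win1", "1b"), ("win2", "2a")]
print(group_candidates(pairs))
# prints [['win1', '1a', '1b'], ['win2', '2a']]
```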

After defining the ImplicationLink, define a variable and point it to the Link body. Then add that variable to the list named 'reference-resolution-rules' at the end of the file, and the rule will be evaluated in the next Resolution process. Before anything else, you must calibrate the Relex rules to detect the property your rule will capture, and check that Relex is sending Frame instances that represent those characteristics. The preconditions of your ImplicationLink must match the exact shape of the Frame instances that Relex sends to the agent.