EmbodimentLanguageComprehension Frames2Relex

From OpenCog

(under development)

This text describes the initial idea for the Frames2Relex module, which is the reverse of Relex2Frames. Since NLGen requires input in Relex format and the embodiment works with frames, the embodiment frames must be transformed into Relex format before NLGen can use them. We are considering two approaches: 1) use the existing Relex2Frames rules to do the reverse work; 2) create new rules specific to our needs. We have adopted approach 2, but this document also describes the first one for documentation purposes. The two approaches are described below:

Approach 1: Use the reverse of Relex2Frames rules

The frames output is generated by IF-THEN rules (Relex2Frames rules) described in the file relex/data/frame/mapping_rules.txt. So, for instance, if we have the Relex output:

_subj(grab,ball), 

then the rule

IF _subj($Manipulation,$var0) THEN ^1_Manipulation:Agent($Manipulation,$var0)

produces the frame

^1_Manipulation:Agent(grab,ball).

Basically, considering the example above, the reverse process (Frames2Relex) must convert the frame

^1_Manipulation:Agent(grab,ball)

into

_subj(grab,ball)

which is the original Relex output.
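
The reversal described above can be sketched in Python. This is only an illustration, not the module's implementation: the function name `reverse_rule` is hypothetical, the rule is assumed to have exactly one condition and one consequence, and `$Manipulation` is treated as a plain variable rather than as a frame-class constraint.

```python
import re

# One Relex2Frames rule, copied from the example above.
RULE = "IF _subj($Manipulation,$var0) THEN ^1_Manipulation:Agent($Manipulation,$var0)"

def reverse_rule(rule: str, frame: str) -> str:
    """Return the Relex relation that `rule` would have consumed to emit `frame`."""
    condition, consequence = rule[len("IF "):].split(" THEN ")
    # Split "name(arg1,arg2)" into the name and its argument list.
    pattern = re.compile(r"^(.*)\((.*)\)$")
    cond_name, cond_args = pattern.match(condition).groups()
    cons_name, cons_args = pattern.match(consequence).groups()
    frame_name, frame_args = pattern.match(frame).groups()
    if cons_name != frame_name:
        raise ValueError("rule does not produce this frame")
    # Bind each $variable on the THEN side to the concrete frame argument.
    bindings = dict(zip(cons_args.split(","), frame_args.split(",")))
    relex_args = [bindings[var] for var in cond_args.split(",")]
    return f"{cond_name}({','.join(relex_args)})"

print(reverse_rule(RULE, "^1_Manipulation:Agent(grab,ball)"))  # _subj(grab,ball)
```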

However, more than one rule may generate the same frame. For instance, the rule

IF $Imperative_relation($Manipulation) ^ _to-do($var0,$Manipulation) THEN ^1_Manipulation:Agent($Manipulation,$var0)

also creates the ^1_Manipulation:Agent(grab,ball) frame. The problem is that it is not possible to identify which rule originally produced the frame, and if we accept the output of both rules, NLGen may not be able to interpret the sentence correctly.
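
The ambiguity can be made concrete by indexing rules by the frame pattern they emit. The sketch below is illustrative only; the `RULES` list and the `producers` index are hypothetical structures holding just the two rules quoted above.

```python
from collections import defaultdict

# (Relex-side conditions, frame-side consequence) pairs taken from the two rules above.
RULES = [
    (["_subj($Manipulation,$var0)"],
     "^1_Manipulation:Agent($Manipulation,$var0)"),
    (["$Imperative_relation($Manipulation)", "_to-do($var0,$Manipulation)"],
     "^1_Manipulation:Agent($Manipulation,$var0)"),
]

# Index the rules by the frame pattern they emit.
producers = defaultdict(list)
for conditions, consequence in RULES:
    producers[consequence].append(conditions)

# A reverse lookup for the Agent frame returns both rules, with nothing to choose between them.
candidates = producers["^1_Manipulation:Agent($Manipulation,$var0)"]
print(len(candidates))  # 2
for conditions in candidates:
    print(" ^ ".join(conditions))
```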

Another issue is that not every Relex output is covered by a rule that generates frames. So the reconstructed Relex output (which will be the NLGen input) may be incomplete, and the sentence may not be recognized either.

These two issues must be taken into account, and if they really impact the results, we have to figure out how to solve them.

Let's then consider an example to see how the whole process will work. Consider the sentence "The ball is blue.", which may be an answer to the question "What is the color of the ball?"

The frames output for this sentence is:

Color:Entity(ball,ball)
Color:Color(blue,blue)
Temporal_colocation:Event(blue,blue)
Temporal_colocation:Time(present,present)
Attributes:Entity(blue,ball)
Attributes:Attribute(blue,blue)

Applying the Relex2Frames rules in reverse, the Relex output for each group of frames would be something like:

Frames: 
Color:Entity(ball,ball)
Color:Color(blue,blue)

Possible Rules for the frames above:
IF _psubj(be,$var0) ^ _obj(be,$Color) THEN ^1_Color:Color($Color,$Color) ^1_Color:Entity($var0,$var0)
IF _subj(be,$var0) ^ _obj(be,$Color) THEN ^1_Color:Color($Color,$Color) ^1_Color:Entity($var0,$var0)
IF _predadj($var0,$Color) THEN ^1_Color:Color($Color,$Color) ^1_Color:Entity($var0,$var0)

Relex output from the frames above:
_psubj(be,ball)
_obj(be,blue)
_subj(be,ball)
_predadj(ball,blue)

Frames:
Temporal_colocation:Event(blue,blue)
Temporal_colocation:Time(present,present)

Possible Rules for the frames above:
IF present($var0) THEN ^1_Temporal_colocation:Time(present,present) ^1_Temporal_colocation:Event($var0,$var0)

Relex for the frames above:
present(blue)

Frames:
Attributes:Entity(blue,ball)
Attributes:Attribute(blue,blue)

Possible Rules for the frames above:
IF _predadj($var1,$var0) ^ NOT $var0=$Locative_relation THEN ^1_Attributes:Attribute($var0,$var0) ^1_Attributes:Entity($var0,$var1)

Relex for the frames above:
_predadj(ball,blue)

Then, the complete relex output would be:

_psubj(be,ball)
_obj(be,blue)
_subj(be,ball)
_predadj(ball,blue)
present(blue)
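
The complete output above is the union of what every candidate rule contributes, with repeated relations listed once. A minimal Python sketch of that merge step, with the per-rule outputs hard-coded from the walkthrough (the `merge` helper is a hypothetical name):

```python
# Relex relations contributed by each candidate rule in the walkthrough above.
candidate_outputs = [
    ["_psubj(be,ball)", "_obj(be,blue)"],   # Color rule using _psubj
    ["_subj(be,ball)", "_obj(be,blue)"],    # Color rule using _subj
    ["_predadj(ball,blue)"],                # Color rule using _predadj
    ["present(blue)"],                      # Temporal_colocation rule
    ["_predadj(ball,blue)"],                # Attributes rule
]

def merge(outputs):
    """Union of all relations, keeping first-seen order and dropping duplicates."""
    flat = (relation for group in outputs for relation in group)
    return list(dict.fromkeys(flat))

complete = merge(candidate_outputs)
print("\n".join(complete))
```

Note that the merged list keeps relations from mutually exclusive rules (_psubj, _subj and _predadj all describe the same subject), which is one way the ambiguity described earlier reaches NLGen.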

To evaluate whether NLGen recognizes this input, the sentence "The ball is blue" was added to the NLGen corpus. The semantic and syntax data were created first, and the Relex output listed above was then used as input. However, NLGen did not recognize the sentence, which may be due to the two issues mentioned earlier.

Another important drawback of this approach is the effort required to implement the reverse rules on top of the already defined ones.

Approach 2: create new rules

The other approach is to create new rules in the embodiment according to the known frames. The following sections show the frames we want to consider in language generation and how each should be handled:

Physiological needs and feelings

There are two options for expressing biological urges and feelings. The first one is to say: "I want to.." and the second one is "I am...". So, you can say "I want to eat" and "I am hungry". Let's see how to generate both types of sentences.

The frames and their elements are:

#Biological_urge:Experiencer => Fido
#Biological_urge:Expressor => Butt (Ignored)
#Gradable_attributes:Attribute => {Poo,Pee,eat,drink} as ATTR
#Gradable_attributes:Degree => High
#Gradable_attributes:Value =>  > 0.5 (threshold)

The sentences that must be created are:

I want to poo.
I want to pee.
I want to eat.

The Relex output that must be generated is:

_subj(ATTR, I)
imperative(ATTR)
hyp(ATTR)
.v(ATTR)
verb(ATTR)
_to-do(want, ATTR)
_subj(want, I)
present(want)
.v(want)
verb(want)
.r(to)
definite(I)
person(I)
pronoun(I)
.p(I)
noun(I)
singular(I)
punctuation(.)
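
One way to read the listing above is as a template in which ATTR is replaced by the urge, applied only when the Gradable_attributes value exceeds the 0.5 threshold. The Python sketch below assumes exactly that; `WANT_TEMPLATE`, `THRESHOLD` and `urge_to_relex` are hypothetical names, not part of the embodiment code.

```python
# The Relex listing above, with ATTR turned into a {attr} slot.
WANT_TEMPLATE = [
    "_subj({attr}, I)", "imperative({attr})", "hyp({attr})", ".v({attr})", "verb({attr})",
    "_to-do(want, {attr})", "_subj(want, I)", "present(want)", ".v(want)", "verb(want)",
    ".r(to)", "definite(I)", "person(I)", "pronoun(I)", ".p(I)", "noun(I)", "singular(I)",
    "punctuation(.)",
]

THRESHOLD = 0.5  # from the Gradable_attributes:Value line above

def urge_to_relex(attribute: str, value: float):
    """Return the Relex lines for "I want to <attribute>", or None at or below the threshold."""
    if value <= THRESHOLD:
        return None
    return [line.format(attr=attribute) for line in WANT_TEMPLATE]

for line in urge_to_relex("eat", 0.8):
    print(line)
```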

Now, let's see how the second type of sentence may be generated:

#Biological_urge:Experiencer => {Fido, I(me), Sally} as ATTR1
#Biological_urge:Expressor => Butt (Ignored)
#Gradable_attributes:Attribute => {Hunger,Thirsty, Angry} as ATTR2
#Gradable_attributes:Degree => High
#Gradable_attributes:Value =>  > 0.5

The sentences that must be created are:

I am hungry.
I am thirsty.
I am angry.

The Relex output that must be generated is:

present(ATTR2)
.a(ATTR2)
adj(ATTR2)
_predadj(ATTR1, ATTR2)
definite(ATTR1)
person(ATTR1)
pronoun(ATTR1)
.p(ATTR1)
noun(ATTR1)
singular(ATTR1)
.v(be)
verb(be)
punctuation(.)

For sentences related to third person, it is a bit different:

Sally is hungry.
Sally is angry.

The Relex output that must be generated is:

present(ATTR2)
.a(ATTR2)
adj(ATTR2)
_predadj(ATTR1, ATTR2)
definite(ATTR1)
person(ATTR1)
noun(ATTR1)
singular(ATTR1)
.v(be)
verb(be)
punctuation(.)
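
Comparing the two listings, the only difference is that the third-person version omits the pronoun(ATTR1) and .p(ATTR1) lines. A small Python sketch of that branching, assuming the first person is always passed as the literal string "I" (the function name `predadj_relex` is hypothetical):

```python
def predadj_relex(experiencer: str, attribute: str):
    """Relex lines for "<experiencer> is/am <attribute>", following the two listings above."""
    lines = [
        f"present({attribute})", f".a({attribute})", f"adj({attribute})",
        f"_predadj({experiencer}, {attribute})",
        f"definite({experiencer})", f"person({experiencer})",
    ]
    if experiencer == "I":
        # Only the first-person listing carries the pronoun tags.
        lines += [f"pronoun({experiencer})", f".p({experiencer})"]
    lines += [f"noun({experiencer})", f"singular({experiencer})",
              ".v(be)", "verb(be)", "punctuation(.)"]
    return lines

print(predadj_relex("Sally", "hungry"))
```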

Holding

This kind of sentence is used to express who/what is holding what, like: I am holding a ball.

The frames considered are:

#Manipulation:Agent => {Fido,Sally} as ATTR_1
#Manipulation:Event => Holds
#Manipulation:Entity => {ball, etc} as ATTR_2
#Manipulation:Time => X (the initial time)
#Manipulation:Duration => Y

Where NOW - X < Y minutes, which indicates that the agent is still holding the entity.
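
The condition above amounts to an elapsed-time check. A minimal Python sketch, assuming NOW and X are timestamps in seconds and Y is given in minutes (the units of X are not stated here, and the function name is hypothetical):

```python
def still_holding(now: float, start: float, duration_minutes: float) -> bool:
    """True while NOW - X < Y, with NOW and X in seconds and Y in minutes (assumed units)."""
    return (now - start) < duration_minutes * 60

print(still_holding(now=100.0, start=0.0, duration_minutes=2))  # True
print(still_holding(now=200.0, start=0.0, duration_minutes=2))  # False
```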

The sentences that must be created are:

I am holding the ball.
I am holding a ball.
Sally is holding the ball.
Sally is holding a ball.

The Relex output that must be generated is:

_obj(hold, ATTR_2)
_subj(hold, ATTR_1)
present_progressive(hold)
.v(hold)
verb(hold)
punctuation(.)
definite(ATTR_2)
.n(ATTR_2)
noun(ATTR_2)
singular(ATTR_2)
.v(be)
verb(be)
definite(ATTR_1)
person(ATTR_1)
pronoun(ATTR_1)
.p(ATTR_1)
noun(ATTR_1)
singular(ATTR_1)
det(the) OR det(a)

Locative Relation

This kind of sentence is used to express the locative relation between two objects or persons.

The frames considered are:

Locative_relation:Figure => {ball,etc} as ATTR_1
Locative_relation:Ground => {barrel,etc} as ATTR_2
Locative_relation:Relation_type => {near, next_to, in_front_of} as ATTR_3

The sentences that must be created are:

The ball is near the barrel.
The ball is next to the barrel.
The ball is in front of the barrel.

The Relex output that must be generated is:

ATTR_3(be, ATTR_2)
_subj(be, ATTR_1)
_obj(near, ATTR_2)
present(be)
verb(be)
punctuation(.)
definite(ATTR_2)
.n(ATTR_2)
noun(ATTR_2)
singular(ATTR_2)
_obj(near, ATTR_2)
noun(near)
det(the)
definite(ATTR_1)
.n(ATTR_1)
noun(ATTR_1)
singular(ATTR_1)
det(the)


Moving

This kind of sentence is used to express the movement of objects or persons in some direction.

The frames considered are:

#Motion_directional:Theme => {ball,etc} as ATTR_1
#Motion_directional:Direction => Vector x,y,z (ignored)
#Motion_directional:Goal => {barrel, me, etc} as ATTR_2
#Motion_directional:Path (ignored)
#Motion_directional:Source (ignored)

The sentences that must be created are:

The ball is moving in my direction.
The ball is moving in Sally's direction.

The Relex output that must be generated is:

_obj(be, moving)
_subj(be, ATTR_1)
present(be)
.v(be)
verb(be)
prep(in)
definite(ATTR_2)
person(ATTR_2)
pronoun(ATTR_2)
adj(ATTR_2)
singular(ATTR_2)
_poss(direction, ATTR_2)
definite(direction)
.n(direction)
noun(direction)
punctuation(.)
in(moving, direction)
.g(moving)
noun(moving)
uncountable(moving)
definite(ATTR_1)
.n(ATTR_1)
noun(ATTR_1)
singular(ATTR_1)
det(the)

Color

This kind of sentence is used to express the color of objects. This information may also be obtained from the AccessoryNode of the color.

The frames considered are:

Color:Entity => {ball,etc} as ATTR_1
Color:Color => {blue,red,etc} as ATTR_2

The sentences that must be created are:

The ball is blue.
The bear is brown.

The Relex output that must be generated is:

present(ATTR_2)
.a(ATTR_2)
adj(ATTR_2)
_predadj(ATTR_1, ATTR_2)
definite(ATTR_1)
.n(ATTR_1)
noun(ATTR_1)
singular(ATTR_1)
.v(be)
verb(be)
punctuation(.)
det(the)

Action

This kind of sentence is used to answer questions like: "What are you doing?" or "What is Sally doing?".

The frames considered are:

#Intentionally_act:Act => {dancing,eating,etc} as ATTR_1
#Intentionally_act:Agent => {I,Sally} as ATTR_2

The sentences that must be created are:

I am dancing.
I am eating.

The Relex output that must be generated is:

_obj(be, ATTR_1)
_subj(be, ATTR_2)
present(be)
.v(be)
verb(be)
punctuation(.)
.n(ATTR_1)
noun(ATTR_1)
uncountable(ATTR_1)
definite(ATTR_2)
person(ATTR_2)
noun(ATTR_2)
singular(ATTR_2)
pronoun(I)
.p(I)

For sentences related to third person, it is a bit different:

Sally is dancing.
Sally is eating.

The Relex output that must be generated is:

_obj(be, ATTR_1)
_subj(be, ATTR_2)
present(be)
.v(be)
verb(be)
punctuation(.)
.g(ATTR_1)
noun(ATTR_1)
uncountable(ATTR_1)
definite(ATTR_2)
person(ATTR_2)
noun(ATTR_2)
singular(ATTR_2)

Yes/No answer

The reasoner decides whether the answer should be Yes or No according to the truth value (TV) of the predicate being asked about. That is, NLGen is not required for these simple answers.
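
A minimal Python sketch of this decision, assuming the TV is reduced to a single strength in [0, 1] and that a strength above 0.5 counts as Yes. Both the reduction and the cutoff are assumptions made for illustration, and `yes_no_answer` is a hypothetical name.

```python
def yes_no_answer(strength: float, threshold: float = 0.5) -> str:
    """Map a truth-value strength to the literal answer, bypassing NLGen."""
    # The 0.5 default is an assumed cutoff, not a documented one.
    return "Yes." if strength > threshold else "No."

print(yes_no_answer(0.9))  # Yes.
print(yes_no_answer(0.1))  # No.
```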


Frames2Relex Rules

Besides defining which relex output should be generated from a specific list of frames, we have also defined how these rules will be stored and used. Each rule may be represented as: List<Frames> => relex_output.
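
The List<Frames> => relex_output representation could be sketched as follows. This is only one possible shape, not the stored format: the class name, the `apply` method and the use of `{ATTR_n}` slots are assumptions, and the example rule reuses the Color listing from above.

```python
from dataclasses import dataclass

@dataclass
class Frames2RelexRule:
    frames: dict        # frame element -> slot name, e.g. "Color:Entity" -> "ATTR_1"
    relex_output: list  # Relex lines written with {slot} placeholders

    def apply(self, frame_values: dict):
        """Return the filled Relex lines, or None if a required frame element is missing."""
        if not all(name in frame_values for name in self.frames):
            return None
        slots = {slot: frame_values[name] for name, slot in self.frames.items()}
        return [line.format(**slots) for line in self.relex_output]

COLOR_RULE = Frames2RelexRule(
    frames={"Color:Entity": "ATTR_1", "Color:Color": "ATTR_2"},
    relex_output=[
        "present({ATTR_2})", ".a({ATTR_2})", "adj({ATTR_2})",
        "_predadj({ATTR_1}, {ATTR_2})",
        "definite({ATTR_1})", ".n({ATTR_1})", "noun({ATTR_1})", "singular({ATTR_1})",
        ".v(be)", "verb(be)", "punctuation(.)", "det(the)",
    ],
)

print(COLOR_RULE.apply({"Color:Entity": "ball", "Color:Color": "blue"}))
```

Keeping the required frame elements next to the template makes the "does this rule apply?" test a simple key check, which is enough for the fixed sentence patterns listed in this document.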


NLGen Integration

After the frames2relex module In order to generate the sentence that represents the answer, the following

Preliminary NLGen corpus related to MV

The following sentences will be used to "train" the NLGen so it will be able to generate the sentences expected in the MV world. The good news is that NLGen recognizes similar sentences. So, if we add the sentence "The ball is blue" to the corpus, it will recognize any similar sentence like "The ball is red", "The bear is brown", "The stick is white" and so on...

The ball is blue.
The ball is blue and red.
The ball is near the fountain.
I am hungry.
Sally is hungry.
I am dancing.
Sally is dancing.
The ball is moving in my direction.
The ball is moving in Sally's direction.
I am holding the ball.
I am holding a ball.
Sally is holding the ball.
Sally is holding a ball.
Yes.
No.