Dependency relations

From OpenCog

RelEx generates dependency relations (also known as binary relations) that connect pairs of words or phrases, and name the relationship between these parts. Thus, for example, in the sentence "John threw the ball", "John" is the subject who is doing the throwing, and "the ball" is the object being thrown. This is denoted as

  _subj (throw, John)
  _obj (throw, ball)

The ordering of the dependency is such that the head is listed first (in the example, the head is throw), and the dependent is listed second. These dependency relations come in two forms: a fixed number of predefined relations, and a much larger number (hundreds) of prepositional relations.
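The head-first notation above is easy to read mechanically. The following is a minimal sketch, not part of RelEx: the names Relation, parse_relation and is_predefined are hypothetical, and the leading-underscore test is only an observation drawn from the examples on this page.

```python
import re
from typing import NamedTuple


class Relation(NamedTuple):
    name: str       # relation name, e.g. "_subj"
    head: str       # the head is listed first
    dependent: str  # the dependent is listed second


# name, optional spaces, "(", head, ",", dependent, ")"
_REL_RE = re.compile(r"^\s*(\S+?)\s*\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)\s*$")


def parse_relation(text: str) -> Relation:
    """Parse one printed relation such as '_subj (throw, John)'."""
    match = _REL_RE.match(text)
    if match is None:
        raise ValueError(f"not a binary relation: {text!r}")
    return Relation(*match.groups())


def is_predefined(rel: Relation) -> bool:
    """Assumption: predefined relations start with an underscore, as in the
    examples on this page, while prepositional ones such as to(go, store) do not."""
    return rel.name.startswith("_")


print(parse_relation("_subj (throw, John)"))
print(is_predefined(parse_relation("to(go, store)")))
```

Keeping the head and dependent as named fields avoids mixing up their order in later processing.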

This page also reviews some of the ways in which RelEx differs from two other popular dependency parsers: MiniPar and the Stanford Parser. As a general idea, RelEx attempts a greater degree of semantic abstraction; that is, it aims less at presenting the syntactic structure of a sentence than at presenting its semantic content. This shows itself in several ways:

  • RelEx attempts basic entity extraction, and thus avoids generating nn noun modifier relations for named entities.
  • RelEx will collapse the object and complement of a preposition into one. Stanford will do this for some, but not all relationships.
  • RelEx will convert passive subjects into objects, and instead indicate passiveness by tagging the verb with a passive tense feature.
  • RelEx avoids generating copulas, if at all possible, and instead indicates copular relations as predicative adjectives, or in other ways.
  • RelEx extracts semantic variables from questions, with the intent of simplifying question answering. For example, Where is the ball? generates _pobj(_%atLocation, _$qVar) _psubj(_%atLocation, ball), which can then pattern-match a plausible answer: _pobj(under, couch).
  • RelEx attempts to extract comparison variables.

Table of dependencies

The fixed, named dependency relations that occur in statements are given in the table below. Some additional relations are generated for questions; these are described in the next section. This table is meant to be authoritative.

RelEx relation | Formal description | Example relation | Example sentence
_amod | attributive adjectival modifier | _amod(door, locked) | The locked door fell open.
_advmod | adverbial modifier | _advmod(run, quickly) | Jim runs quickly.
_appo | appositive of a noun (appositional modifier) | _appo(bird, robin) | The bird, a robin, sang sweetly.
_comp | complement | _comp(that, fly) | I think that dogs can fly.
_comparative | comparative (handles more/less relationships) | _comparative(quickly, run) | He runs more/less quickly than John.
_date_day | day of month | _date_day(December, 3rd) | It happened on December 3rd, 1990.
_date_year | year modifier | _date_year(December, 1990) | It happened on December 3rd, 1990.
_expl | expletive/filler subject | _expl(be, there) | There is a place we can go.
_iobj | indirect object | _iobj(give, quarterback) | The linebacker gave the quarterback a push.
_measure_distance | unit of distance measure | _measure_distance(away, foot) | The boy is 4 feet away.
_measure_per | unit of repetition | _measure_per(times, day) | Take these 4 times a day.
_measure_size | unit of size measure | _measure_size(tall, feet) | The boy is 4 feet tall.
_measure_time | unit of time measure | _measure_time(old, years) | The birthday boy is 12 years old.
_modal | modal verb | _modal(can, fly) | I think that dogs can fly.
_nn | prenominal modifier of a noun | _nn(line, goal) | He stood at the goal line.
_obj | direct object, also used for passive nominal subject | _obj(eat, sandwich) | He ate the sandwich.
_parataxis | parataxis | _parataxis(leave, say) | The guy, John said, left early in the morning.
_pobj | object of preposition | _pobj(next_to, house) | The garage is next to the house.
_poss | possessive or genitive modifier of a noun (gen) | _poss(hand, John) | John's hand slipped.
_predadj | predicative adjectival modifier | _predadj(Smith, late) | Mr. Smith is late.
_predet | predeterminer | _predet(boy, all) | All the boys knew it.
_psubj | subject of preposition | _psubj(next_to, garage) | The garage is next to the house.
_quantity | numeric modifier | _quantity(dollar, three) | He lost three dollars.
_quantity_mod | quantity modifier | _quantity_mod(three, almost) | He lost almost three dollars.
_quantity_mult | quantity multiplier | _quantity_mult(hundred, three) | He lost three hundred dollars.
_rep | representation (claim, statement, thought) | _rep(think, that) | I think that dogs can fly.
_repq | representational question | _repq(ask, how) | Tom asked me how to do it.
_subj | subject of a verb | _subj(be, he), _subj(do, one) | He is the one that did it.
_that | null/implied preposition "that" | _that(know, angry) | He knew I was angry.
_to-be | adjectival complement (acomp) | _to-be(smell, sweet) | The rose smelled sweet.
_to-do | clausal complement (ccomp/xcomp) | _to-do(like, row) | Linas likes to row.

Any given relation, including _subj and _obj, may occur multiple times in one sentence. For example, "He is the one that did it." yields two _subj relations: "he" is the subject of "be", and "one" is the subject of "do".

Parsing questions

The following dependencies are generated during the analysis of questions. These are meant to identify the variables in a question. Thus, for example, consider the question-answer pair: Where is the ball? The ball is under the couch. The question generates

_%atLocation(_%copula, _$qVar)
_subj(_%copula, ball)

while the corresponding statement generates:

under(be, couch)
_subj(be, ball)

Treating the above as a pattern matching problem, one can see that the question is answered by grounding the query variables as follows:

_%atLocation -> under
_%copula -> be
_$qVar -> couch 
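The grounding above can be reproduced with a small backtracking matcher. This is a sketch only, not the RelEx or OpenCog pattern matcher: the function names are hypothetical, variables are recognized purely by the "_%" and "_$" prefixes used on this page, and each statement relation is assumed to be usable at most once.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (relation name, head, dependent)


def is_variable(term: str) -> bool:
    # On this page, query variables are written with a leading "_%" or "_$".
    return term.startswith("_%") or term.startswith("_$")


def unify(pattern: Triple, fact: Triple,
          bindings: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend bindings so that pattern equals fact, or return None."""
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_variable(p):
            if result.setdefault(p, f) != f:
                return None  # variable is already bound to something else
        elif p != f:
            return None
    return result


def ground(question: List[Triple], statement: List[Triple],
           bindings: Optional[Dict[str, str]] = None) -> Optional[Dict[str, str]]:
    """Search for bindings that match every question relation to a distinct
    statement relation."""
    bindings = {} if bindings is None else bindings
    if not question:
        return bindings
    first, rest = question[0], question[1:]
    for i, fact in enumerate(statement):
        extended = unify(first, fact, bindings)
        if extended is not None:
            remaining = statement[:i] + statement[i + 1:]
            solution = ground(rest, remaining, extended)
            if solution is not None:
                return solution
    return None


question = [("_%atLocation", "_%copula", "_$qVar"), ("_subj", "_%copula", "ball")]
statement = [("under", "be", "couch"), ("_subj", "be", "ball")]
print(ground(question, statement))
# {'_%atLocation': 'under', '_%copula': 'be', '_$qVar': 'couch'}
```

Because the relation name itself may be a variable, the matcher treats all three positions of a relation uniformly.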


See also additional discussion at query variables and comparison variables.

RelEx relation | Formal description | Example relation | Example question | Response pattern
_%atLocation | location query | _%atLocation(happen, _$qVar) | Where did that happen? | It happened in the zoo.
_%atTime | time query | _%atTime(happen, _$qVar) | When did that happen? | It happened at 5 o'clock.
_%because | causative query | _%because(fall, _$qVar) | Why did he fall? | He fell because he tripped.
_%how | method query | _%how(do, _$qVar) | How should I do it? | Do it like this.
_%howdeg | how-of-degree query | | How quickly does he run? |

Comparisons to other parsers

Other dependency grammars, such as Dekang Lin's MiniPar or the Stanford parser, generate similar relations. In the case of the Stanford parser, RelEx can be placed into a "compatibility mode", wherein it generates exactly the same output. This is useful both for testing, and because the RelEx parser is more than three times faster. However, the "native" RelEx output differs from the Stanford parser in important ways.

The notes below document some of the differences between RelEx and these other systems, and provide some rationale for these differences. As a general rule, the Stanford parser attempts to be more "syntactically precise", reflecting syntactic relations in its dependencies, whereas RelEx tries to be more "semantic", in that it tries to normalize different syntactic constructions to obtain the same semantic forms. The goal of this greater semantic normalization is to provide a simpler and more reliable substrate for semantic reasoning and question answering.

To perform this "semantic normalization" without losing important information, RelEx makes extensive use of feature tags, which the Stanford parser does not generate.

Within the context of "Meaning-Text Theory", advanced by Igor Mel'čuk et al., one would say that the output of the Stanford parser is strictly limited to the SSyntR, or surface syntactic representation of a sentence, while RelEx, in general, attempts to come closer to the DSyntR, or deep syntactic representation of a sentence (see ref below).

These differences are not set in stone: RelEx is malleable; and in some cases, RelEx is surely buggy. These are "working notes" rather than a "finished work".

Entity extraction

RelEx attempts basic entity extraction, and generates polyword phrases in its output. For example, "I live in New York" parses as:

_subj(live, I)
in(live, New_York)

whereas the Stanford parser generates the more syntactically literal nn dependencies:

nsubj(live, I)
nn(York, New)
prep_in(live, York)
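To make the relationship between the two outputs concrete, the sketch below folds nn pairs into polywords. It only illustrates how the outputs correspond, not how RelEx performs entity extraction; merge_nn is a hypothetical helper, it assumes at most one nn modifier per head noun, and it leaves the relation names (nsubj, prep_in) untouched.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (relation name, head, dependent)


def merge_nn(relations: List[Triple]) -> List[Triple]:
    """Fold nn(head, modifier) pairs into polywords such as New_York."""
    # Map each head noun to its polyword (assumes one nn modifier per head).
    polyword: Dict[str, str] = {
        head: f"{mod}_{head}" for name, head, mod in relations if name == "nn"
    }
    merged: List[Triple] = []
    for name, head, dep in relations:
        if name == "nn":
            continue  # absorbed into the polyword
        merged.append((name, polyword.get(head, head), polyword.get(dep, dep)))
    return merged


stanford = [("nsubj", "live", "I"), ("nn", "York", "New"), ("prep_in", "live", "York")]
print(merge_nn(stanford))
# [('nsubj', 'live', 'I'), ('prep_in', 'live', 'New_York')]
```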

Predicative adjectives, copulas

Note that attributive adjectives can have different meanings than predicative adjectives: The late Mr. Smith vs. Mr. Smith is late. Thus, RelEx attempts to distinguish these:

_amod(Smith, late)

for the first, and, for the second:

_predadj(Smith, late) 

In both cases, RelEx retains "late" as an adjectival modifier.

By contrast, the Stanford parser does not make any special distinction, and instead treats the adjective as if it were a verb, but then marks this pseudo-verb as copular:

 nsubj(late-4, Smith-2)
 cop(late-4, is-3)

Again, RelEx does this in order to simplify pattern-matching for reasoning and question answering: the RelEx markup is more normalized, and stays closer to the actual semantics.
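The correspondence between the Stanford nsubj/cop pair and _predadj can be sketched as a small rewrite. This is an illustration under assumptions, not RelEx code: to_predadj is a hypothetical name, word indices such as "late-4" are omitted, and the head of the cop relation is assumed to be an adjective (a copular noun, as in "He is a doctor", would need different handling).

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (relation name, head, dependent)


def to_predadj(relations: List[Triple]) -> List[Triple]:
    """Rewrite nsubj(adj, noun) + cop(adj, verb) as _predadj(noun, adj)."""
    copular_heads = {head for name, head, _ in relations if name == "cop"}
    out: List[Triple] = []
    for name, head, dep in relations:
        if name == "cop":
            continue  # the copula itself is not kept as a relation
        if name == "nsubj" and head in copular_heads:
            out.append(("_predadj", dep, head))  # the noun becomes the head
        else:
            out.append((name, head, dep))
    return out


print(to_predadj([("nsubj", "late", "Smith"), ("cop", "late", "is")]))
# [('_predadj', 'Smith', 'late')]
```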

Prepositional relations

Rather than generating a prepositional object and a prepositional complement (the way that many other parsers do), RelEx collapses both of these into a single prepositional relation, with the preposition linking the object to its verb. As a result, RelEx generates at least as many prepositional relations as there are prepositions. Thus, for example, consider the sentence He went to the store. MiniPar outputs the following:

pobj (go, store)
pcomp (go, to)

RelEx will collapse these two into one relation:

to(go, store) 

Collapsing these two into one seems reasonable: the result is completely unambiguous (the traditional form can be easily obtained, if needed), and it is also easier to scan and comprehend at a quick glance. Some additional examples follow.

Preposition | Relation example | Example sentence
to | to(go, store) | He went to the store.
at | at(look, building) | He looked at the building.
off | off(wander, field) | He wandered off the field.
for | for(jump, ball) | He jumped for the ball.
by | by(cause, linebacker) | That fumble was caused by the linebacker.
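The collapsing step described in this section can be sketched as a rewrite over the pobj/pcomp pair shown for MiniPar. The helper name collapse_prepositions is hypothetical, and the sketch assumes a single preposition per head verb; it is not how RelEx is implemented.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (relation name, head, dependent)


def collapse_prepositions(relations: List[Triple]) -> List[Triple]:
    """Merge pobj(verb, object) and pcomp(verb, prep) into prep(verb, object)."""
    # verb -> preposition, taken from the pcomp relations
    prep_of: Dict[str, str] = {
        head: dep for name, head, dep in relations if name == "pcomp"
    }
    out: List[Triple] = []
    for name, head, dep in relations:
        if name == "pcomp":
            continue  # folded into the relation name below
        if name == "pobj" and head in prep_of:
            out.append((prep_of[head], head, dep))
        else:
            out.append((name, head, dep))
    return out


print(collapse_prepositions([("pobj", "go", "store"), ("pcomp", "go", "to")]))
# [('to', 'go', 'store')]
```

The result keeps both pieces of information, the preposition as the relation name and the object as the dependent, which is why the two-relation form can be recovered if needed.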

Complements in the Stanford parser

The above example of a prepositional complement is treated the same way by both RelEx and the Stanford parser, the latter generating:

prep_to(go, store)

However, the Stanford parser does not always collapse the complement back into the relation. Below are examples where RelEx does collapse the complement, but Stanford does not. This is one reason why RelEx uses substantially fewer relations than Stanford. (RelEx could be easily modified to generate these, and perhaps should be, for compatibility reasons?)

Stanford type | Example sentence | Stanford output | RelEx output | Link Grammar disjunct
Preposition | He went to the store. | prep_to(go, store) | to(go, store) | MVp- Js+
Adverbial clause modifier | The accident happened as the night was falling. | advcl(happen, fall), mark(fall, as) | as(happen, fall) | MVs- Cs+
Agent | The man has been killed by the police. | agent(kill, police) (Stanford does not indicate complement) | by(kill, police) | MVp- Jp+
Clausal complement | He says that you like to swim | ccomp(say, like), complm(like, that) | that(say, like) | TH- Cet+
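The rows of this comparison suggest a mechanical mapping from the Stanford output to the RelEx output. The sketch below covers only the three uncollapsed cases shown (advcl with mark, ccomp with complm, and agent); collapse_complements and MARKER_FOR are hypothetical names, and this is not the RelEx compatibility mode.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (relation name, head, dependent)

# Stanford clause relation -> relation that carries its introducing word.
MARKER_FOR: Dict[str, str] = {"advcl": "mark", "ccomp": "complm"}


def collapse_complements(relations: List[Triple]) -> List[Triple]:
    marker_names = set(MARKER_FOR.values())
    # (marker relation, clause verb) -> introducing word,
    # e.g. ("mark", "fall") -> "as"
    word_for = {
        (name, head): dep for name, head, dep in relations if name in marker_names
    }
    collapsed: List[Triple] = []
    for name, head, dep in relations:
        if name in marker_names:
            continue  # folded into the clause relation (dropped if unmatched)
        if name == "agent":
            collapsed.append(("by", head, dep))  # agent implies the preposition "by"
            continue
        word = word_for.get((MARKER_FOR.get(name, ""), dep))
        collapsed.append((word, head, dep) if word else (name, head, dep))
    return collapsed


print(collapse_complements([("advcl", "happen", "fall"), ("mark", "fall", "as")]))
# [('as', 'happen', 'fall')]
```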

Functional words in the Stanford parser

RelEx also differs from the Stanford parser in the way that "functional words" are handled. Examples of functional words are determiners ("the" in "the book") and passives ("has been" in "has been killed").

RelEx sticks closer to Tesnière's formulation of dependency: function words are grouped with their head word (are features of the head word), and do not participate in dependencies.

Passives and tenses

Stanford and RelEx differ in how passives, auxiliaries and verb tenses are handled. Again, RelEx attempts to provide a greater degree of semantic normalization, and is less preoccupied with presenting syntactic structure.

For example, the sentence: The man has been killed by the police results in the following Stanford output:

nsubjpass(killed, man)
aux(killed, has)
auxpass(killed, been)

which accurately reflects the syntactic structure of the sentence. By contrast, RelEx generates:

_obj(kill, man)
tense(kill, present_perfect_passive)

Here, the use of _obj follows from the semantically similar paraphrase: The police killed the man, while the tense markup indicates the passive nature of the syntactic construction.

In general, RelEx will present nsubjpass as the object, while collapsing aux and auxpass into a tense markup. The reason for doing this is that it makes pattern-matching for questions and for reasoning simpler, in that the patterns can handle tenses in a uniform way, without having to fiddle with aux/auxpass dependencies.
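A minimal sketch of this normalization follows. It is an illustration only: normalize_verb_group is a hypothetical name, the lemma dictionary stands in for a real lemmatizer, and the TENSES table contains just the two tense names that appear on this page.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (relation name, head, dependent)

# Auxiliary words, in sentence order -> tense name. Only the two tense names
# found on this page are listed; a real table would be much larger.
TENSES: Dict[Tuple[str, ...], str] = {
    ("has",): "present_perfect",
    ("has", "been"): "present_perfect_passive",
}


def normalize_verb_group(relations: List[Triple],
                         lemma: Dict[str, str]) -> List[Triple]:
    """Turn nsubjpass into _obj and fold aux/auxpass into one tense feature."""
    auxiliaries: Dict[str, List[str]] = {}
    out: List[Triple] = []
    for name, head, dep in relations:
        verb = lemma.get(head, head)  # e.g. "killed" -> "kill"
        if name in ("aux", "auxpass"):
            auxiliaries.setdefault(verb, []).append(dep)
        elif name == "nsubjpass":
            out.append(("_obj", verb, dep))
        else:
            out.append((name, verb, dep))
    for verb, words in auxiliaries.items():
        tense = TENSES.get(tuple(words))
        if tense is not None:  # unknown auxiliary sequences are left unmarked
            out.append(("tense", verb, tense))
    return out


stanford = [("nsubjpass", "killed", "man"), ("aux", "killed", "has"),
            ("auxpass", "killed", "been")]
print(normalize_verb_group(stanford, {"killed": "kill"}))
# [('_obj', 'kill', 'man'), ('tense', 'kill', 'present_perfect_passive')]
```

A pattern that asks "who or what was killed?" can then match _obj(kill, ...) regardless of whether the sentence was active or passive.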

Determiners

Similar remarks apply for determiners. Consider "A book is on the table." Here, Stanford generates the relations

det(book, a)
det(table, the)

By contrast, RelEx marks "table" with a feature tag:

definite-FLAG(table)

and doesn't mark "book" at all.
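As a rough sketch of the same idea, det relations can be separated from the dependency list and turned into a set of definite nouns. The name split_determiners is hypothetical, and treating only "the" as definite is an assumption based on the single example above.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (relation name, head, dependent)


def split_determiners(relations: List[Triple]) -> Tuple[List[Triple], List[str]]:
    """Return (relations without det, nouns that get the definite flag)."""
    kept: List[Triple] = []
    definite: List[str] = []
    for name, head, dep in relations:
        if name != "det":
            kept.append((name, head, dep))
        elif dep.lower() == "the":
            definite.append(head)
        # other determiners, such as "a", leave no mark on the noun
    return kept, definite


print(split_determiners([("det", "book", "a"), ("det", "table", "the")]))
# ([], ['table'])
```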

Participial modifiers

Consider the sentence: "Truffles picked during the spring are tasty." RelEx generates:

during(pick, spring)
_predadj(truffle, tasty)
_obj(pick, truffle)

whereas Stanford generates:

prep_during(pick, spring)
nsubj(tasty, truffle)
partmod(truffle, pick)
cop(tasty, be)

Aside from the _predadj/nsubj difference, discussed above, note the difference in _obj/partmod. Although the Stanford partmod is closer to the syntax of the sentence, RelEx chooses to generate _obj instead, as this seems to convey the semantic content of the sentence in a more direct way.

Other dependency relations

Other dependency grammars, such as Dekang Lin's MiniPar or the Stanford parser, generate similar relations. The table below documents some of the other commonly used relation types, and their equivalents in RelEx. The point is that RelEx does not generate these relations.

Relation | Description | Example text | Example relation | Comment
acomp | adjectival complement | The rose smelled sweet. | acomp(smelled, sweet) | Generated by the Stanford parser, identical to RelEx output _to-
advcl | adverbial clause modifier | The accident happened as the night was falling. | advcl(happen, fall) | Generated by Stanford, see complement discussion above.
appos | appositional modifier | Sam, my brother | appos(Sam, brother) | Generated by Stanford, identical to RelEx _appo.
aux | (modal) auxiliary | Reagan has died. | aux(died, has) | RelEx indicates the tense feature: tense(die, present_perfect)
auxpass | passive auxiliary | Kennedy has been killed. | auxpass(killed, been) | RelEx indicates the tense feature: tense(kill, present_perfect_passive)
ccomp | clausal complement | | | Generated by Stanford,
cop | copula | The rose smelled sweet. | cop(smelled, sweet) | Generated by MiniPar, identical to RelEx output _to-be
csubj | clausal subject | What she said makes sense. | csubj(make, say) | Generated by Stanford; RelEx uses plain _subj.
det | determiner of a noun | the cookie | det(cookie, the) | RelEx uses the DEFINITE-FLAG feature instead.
gen | genitive modifier of a noun | Alice's cookie | gen(cookie, Alice) | Identical to RelEx output _poss.
infmod | infinitival modifier | | | RelEx usually generates a plain _obj.
mark | complement of adverbial clause modifier | The accident happened as the night was falling. | mark(fall, as) | Generated by Stanford, see complement discussion above.
neg | negative | haven't | neg(have, n't) | RelEx usually generates NEGATIVE-FLAG(have, T)
nsubjpass | passive nominal subject | rocks were thrown | nsubjpass(thrown, rocks) | Discussed above. RelEx identifies these as _obj, and marks the verb with a passive feature.
num | numeric modifier | three dollars | num(dollar, three) | Identical to RelEx output _quantity.
partmod | participial modifier | | | RelEx usually generates a plain _obj.
pcomp | complement of a preposition | It happened in the garden | pcomp(garden, in) | RelEx uses the general prepositional relations instead, e.g. in(happen, garden)
prt | phrasal verb particle | They shut down the station. | prt(shut, down) | RelEx always contracts the particle to create a polyword, e.g. _subj(shut_down, they), _obj(shut_down, station)
sc | small clause complement of a verb | Alice forced him to stop. | sc(stop, to) | RelEx uses the general prepositional relations instead, e.g. to(force, stop)

Standard list of dependency relations

Below is a list of "industry standard" dependency relations, as defined in de Marneffe et al. (2006). This is in turn derived from King et al. (2003), which in turn derives from Carroll et al. (1999). This list is the set of dependencies currently used by the Stanford parser.

The list is structured as a tree, with dep as the root of the tree. Marking a relation as being of type dep simply implies that there is a dependency relation between words. Ideally, relations are marked with the most refined relation possible.

dep - dependent
  aux - auxiliary
    auxpass - passive auxiliary
    cop - copula
  conj - conjunct
  cc - coordination
  arg - argument
    subj - subject
      nsubj - nominal subject
        nsubjpass - passive nominal subject
      csubj - clausal subject
    comp - complement
      obj - object
        dobj - direct object
        iobj - indirect object
        pobj - object of preposition
      attr - attributive
      ccomp - clausal complement with internal subject
      xcomp - clausal complement with external subject
      complm - complementizer
      mark - marker (word introducing an advcl)
      rel - relative (word introducing a rcmod)
      acomp - adjectival complement
    agent - agent
  ref - referent
  expl - expletive (expletive there)
  mod - modifier
    advcl - adverbial clause modifier
    purpcl - purpose clause modifier
    tmod - temporal modifier
    rcmod - relative clause modifier
    amod - adjectival modifier
    infmod - infinitival modifier
    partmod - participial modifier
    num - numeric modifier
    number - element of compound number
    appos - appositional modifier
    nn - noun compound modifier
    abbrev - abbreviation modifier
    advmod - adverbial modifier
      neg - negation modifier
    poss - possession modifier
    possessive - possessive modifier ('s)
    prt - phrasal verb particle
    det - determiner
    prep - prepositional modifier (also pmod, sometimes)
  sdep - semantic dependent
    xsubj - controlling subject
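Because the relations form a tree rooted at dep, a pattern written against a general relation can also accept its refinements. The sketch below shows one way to test this. The PARENT table is only an excerpt, its nesting is an assumption taken from the hierarchy in de Marneffe et al. (2006), and the function names are hypothetical.

```python
from typing import Dict, List

# Excerpt of the hierarchy as child -> parent. Only a few branches are listed;
# the nesting is assumed from de Marneffe et al. (2006).
PARENT: Dict[str, str] = {
    "aux": "dep", "auxpass": "aux", "cop": "aux",
    "arg": "dep", "subj": "arg", "nsubj": "subj", "nsubjpass": "nsubj",
    "csubj": "subj", "comp": "arg", "obj": "comp", "dobj": "obj",
    "iobj": "obj", "pobj": "obj", "mod": "dep", "advmod": "mod", "neg": "advmod",
}


def ancestors(relation: str) -> List[str]:
    """Return the chain of ever more general relations, ending at the root."""
    chain: List[str] = []
    while relation in PARENT:
        relation = PARENT[relation]
        chain.append(relation)
    return chain


def is_a(relation: str, general: str) -> bool:
    """True if relation equals general or is a refinement of it."""
    return relation == general or general in ancestors(relation)


print(ancestors("nsubjpass"))      # ['nsubj', 'subj', 'arg', 'dep']
print(is_a("nsubjpass", "subj"))   # True
print(is_a("dobj", "subj"))        # False
```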

References

  • Marie-Catherine de Marneffe, Bill MacCartney, and Christopher D. Manning. 2006. "Generating Typed Dependency Parses from Phrase Structure Parses".
  • John Carroll, Guido Minnen, and Ted Briscoe. 1999. "Corpus annotation for parser evaluation". In Proceedings of the EACL workshop on Linguistically Interpreted Corpora (LINC).
  • Tracy H. King, Richard Crouch, Stefan Riezler, Mary Dalrymple, and Ronald Kaplan. 2003. "The PARC 700 dependency bank". In 4th International Workshop on Linguistically Interpreted Corpora (LINC-03).