RelEx generates dependency relations (also known as binary relations) that connect pairs of words or phrases and name the relationship between them. For example, in the sentence "John threw the ball", "John" is the subject who is doing the throwing, and "the ball" is the object being thrown. This is denoted as
_subj (throw, John) _obj (throw, ball)
The ordering of the dependency is such that the head is listed first (in the example, the head is throw), and the dependent is listed second. These dependency relations come in two forms: a fixed number of predefined relations, and a much larger number (hundreds) of prepositional relations.
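The notation above is regular enough to read mechanically. The following is a minimal sketch, not part of RelEx itself, that splits such a string into (name, head, dependent) triples using a regular expression; the names `Relation` and `parse_relations` are assumptions made for this illustration.

```python
import re
from typing import List, NamedTuple


class Relation(NamedTuple):
    name: str       # relation label, e.g. "_subj"
    head: str       # first argument: the head word
    dependent: str  # second argument: the dependent word


# One relation looks like  name(head, dependent), with optional spaces.
_RELATION_RE = re.compile(
    r"([^\s(),]+)\s*\(\s*([^\s(),]+)\s*,\s*([^\s(),]+)\s*\)"
)


def parse_relations(text: str) -> List[Relation]:
    """Extract every name(head, dependent) triple found in text."""
    return [Relation(*m.groups()) for m in _RELATION_RE.finditer(text)]


relations = parse_relations("_subj (throw, John) _obj (throw, ball)")
for rel in relations:
    print(rel.name, rel.head, rel.dependent)
# prints:
# _subj throw John
# _obj throw ball
```

Because the head always comes first, the verb "throw" appears in the `head` field of both triples.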
This page also reviews some of the ways in which RelEx differs from two other popular dependency parsers: MiniPar and the Stanford Parser. As a general idea, RelEx attempts a greater degree of semantic abstraction; that is, it aims less at presenting the syntactic structure of a sentence than at presenting its semantic content. This shows itself in several ways:
- RelEx attempts basic entity extraction, and thus avoids generating nn noun modifier relations for named entities.
- RelEx will collapse the object and complement of a preposition into one. Stanford will do this for some, but not all relationships.
- RelEx will convert passive subjects into objects, and instead indicate passiveness by tagging the verb with a passive tense feature.
- RelEx avoids generating copulas, if at all possible, and instead indicates copular relations as predicative adjectives, or in other ways.
- RelEx extracts semantic variables from questions, with the intent of simplifying question answering. For example, Where is the ball? generates _pobj(_%atLocation, _$qVar) _psubj(_%atLocation, ball), which can then pattern-match a plausible answer: _pobj(under, couch).
- RelEx attempts to extract comparison variables.
Table of dependencies
The fixed, named dependency relations that occur in statements are given in the table below. Some additional relations are generated for questions; these are described in the next section. This table is meant to be authoritative.
|RelEx relation||Formal description||Example Relation||Example Sentence|
|_amod||attributive adjectival modifier||_amod(door, locked)||The locked door fell open.|
|_advmod||adverbial modifier||_advmod(run, quickly)||Jim runs quickly.|
|_appo||appositive of a noun (appositional modifier)||_appo(bird, robin)||The bird, a robin, sang sweetly.|
|_comp||complement||_comp(that, fly)||I think that dogs can fly.|
|_comparative||comparative (handle more/less relationships)||_comparative(quickly, run)||He runs more/less quickly than John.|
|_date_day||day of month||_date_day(December, 3rd)||It happened on December 3rd, 1990.|
|_date_year||year modifier||_date_year(December, 1990)||It happened on December 3rd, 1990.|
|_expl||expletive/filler subject||_expl(be, there)||There is a place we can go.|
|_iobj||indirect object||_iobj(give, quarterback)||The linebacker gave the quarterback a push.|
|_measure_distance||unit of distance measure||_measure_distance(away, foot)||The boy is 4 feet away.|
|_measure_per||unit of repetition.||_measure_per(times, day)||Take these 4 times a day.|
|_measure_size||unit of size measure||_measure_size(tall, feet)||The boy is 4 feet tall.|
|_measure_time||unit of time measure||_measure_time(old, years)||The birthday boy is 12 years old.|
|_modal||modal verb||_modal(can, fly)||I think that dogs can fly.|
|_nn||prenominal modifier of a noun||_nn(line, goal)||He stood at the goal line.|
|_obj||direct object, also used for passive nominal subject.||_obj(eat, sandwich)||He ate the sandwich.|
|_parataxis||parataxis||_parataxis(leave, say)||The guy, John said, left early in the morning.|
|_pobj||object of preposition||_pobj(next_to, house)||The garage is next to the house.|
|_poss||possessive or genitive modifier of a noun (gen)||_poss(hand, John)||John's hand slipped.|
|_predadj||predicative adjectival modifier||_predadj(Smith, late)||Mr. Smith is late.|
|_predet||predeterminer||_predet(boy, all)||All the boys knew it.|
|_psubj||subject of preposition||_psubj(next_to, garage)||The garage is next to the house.|
|_quantity||numeric modifier||_quantity(dollar, three)||He lost three dollars.|
|_quantity_mod||quantity modifier||_quantity_mod(three, almost)||He lost almost three dollars.|
|_quantity_mult||quantity multiplier||_quantity_mult(hundred, three)||He lost three hundred dollars.|
|_rep||representation (claim, statement, thought)||_rep(think, that)||I think that dogs can fly.|
|_repq||representational question||_repq(ask, how)||Tom asked me how to do it.|
|_subj||subject of a verb||_subj(be, he) _subj(do, one)||He is the one that did it.|
|_that||null/implied preposition "that"||_that(know, angry)||He knew I was angry.|
|_to-be||adjectival complement (acomp)||_to-be(smell, sweet)||The rose smelled sweet.|
|_to-do||clausal complement (ccomp/xcomp)||_to-do(like, row)||Linas likes to row.|
Any given relation, including _subj and _obj, may occur multiple times in one sentence. For example, "He is the one that did it." contains two subjects: "he" is the subject of "be", and "one" is the subject of "do".
Parsing questions
The following dependencies are generated during the analysis of questions. These are meant to identify the variables in a question. Thus, for example, consider the question-answer pair: Where is the ball? The ball is under the couch. The question generates
_%atLocation(_%copula, _$qVar) _subj(_%copula, ball)
while the corresponding statement generates:
under(be, couch) _subj(be, ball)
Treating the above as a pattern matching problem, one can see that the question is answered by grounding the query variables as follows:
_%atLocation -> under _%copula -> be _$qVar -> couch
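The grounding above can be found by a simple unification search. The sketch below is an illustration only, not the matcher used by RelEx or any downstream system: it assumes that relations are given as (name, head, dependent) triples and that any token starting with `_%` or `_$` is a query variable. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (relation name, head, dependent)


def is_variable(token: str) -> bool:
    """Assumption: query variables carry a "_%" or "_$" prefix."""
    return token.startswith("_%") or token.startswith("_$")


def unify(pattern: Triple, fact: Triple,
          bindings: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend bindings so that pattern equals fact; None on a clash."""
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_variable(p):
            # Bind the variable, or check it against its earlier binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def ground(query: List[Triple], facts: List[Triple],
           bindings: Optional[Dict[str, str]] = None
           ) -> Optional[Dict[str, str]]:
    """Backtracking search for one consistent grounding of the query."""
    bindings = {} if bindings is None else bindings
    if not query:
        return bindings
    for fact in facts:
        extended = unify(query[0], fact, bindings)
        if extended is not None:
            result = ground(query[1:], facts, extended)
            if result is not None:
                return result
    return None


question = [("_%atLocation", "_%copula", "_$qVar"),
            ("_subj", "_%copula", "ball")]
statement = [("under", "be", "couch"), ("_subj", "be", "ball")]
answer = ground(question, statement)
print(answer)
# prints: {'_%atLocation': 'under', '_%copula': 'be', '_$qVar': 'couch'}
```

Note that this sketch lets `_%atLocation` bind to any relation name; a real matcher would presumably restrict it to locative prepositions.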
|RelEx relation||Formal description||Example Relation||Example Question||Response Pattern|
|_%atLocation||location query||_%atLocation(happen, _$qVar)||Where did that happen?||It happened in the zoo.|
|_%atTime||time query||_%atTime(happen, _$qVar)||When did that happen?||It happened at 5 o'clock.|
|_%because||causative query||_%because(fall, _$qVar)||Why did he fall?||He fell because he tripped.|
|_%how||method query||_%how(do, _$qVar)||How should I do it?||Do it like this.|
|_%howdeg||how-of-degree query||How quickly does he run?|
Comparisons to other parsers
Other dependency grammars, such as Dekang Lin's MiniPar or the Stanford parser, generate similar relations. In the case of the Stanford parser, RelEx can be placed into a "compatibility mode", wherein it generates exactly the same output. This is useful both for testing, and because the RelEx parser is more than three times faster. However, the "native" RelEx output differs from the Stanford parser in important ways.
The notes below document some of the differences between RelEx and these other systems, and provide some rationale for these differences. As a general rule, the Stanford parser attempts to be more "syntactically precise", reflecting syntactic relations in its dependencies, whereas RelEx tries to be more "semantic", in that it tries to normalize different syntactic constructions to obtain the same semantic forms. The goal of this greater semantic normalization is to provide a simpler, more reliable substrate for semantic reasoning and question-answering.
To perform this "semantic normalization" without losing important information, RelEx makes extensive use of feature tags, which the Stanford parser does not generate.
Within the context of "Meaning-Text Theory", advanced by Igor Mel'čuk et al., one would say that the output of the Stanford parser is strictly limited to the SSyntR, or surface syntactic representation, of a sentence, while RelEx, in general, attempts to come closer to the DSyntR, or deep syntactic representation, of a sentence (see the references below).
These differences are not set in stone: RelEx is malleable; and in some cases, RelEx is surely buggy. These are "working notes" rather than a "finished work".
RelEx attempts basic entity extraction, and generates polyword phrases in its output. For example, "I live in New York" parses as:
_subj(live, I) in(live, New_York)
whereas the Stanford parser generates the more syntactically literal nn dependencies:
nsubj(live, I) nn(York, New) prep_in(live, York)
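To make the difference concrete, the sketch below folds Stanford-style nn relations into underscore-joined polywords. This is only a post-hoc illustration of the two output shapes: RelEx performs entity extraction itself, and the function `collapse_nn` is a hypothetical helper, not a RelEx API. It assumes at most one nn modifier per head word and leaves relation names unchanged.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (relation name, head, dependent)


def collapse_nn(relations: List[Triple]) -> List[Triple]:
    """Fold each nn(head, modifier) pair into a "modifier_head" polyword."""
    # Map every compound head to its polyword, e.g. "York" -> "New_York".
    polyword = {head: f"{mod}_{head}"
                for name, head, mod in relations if name == "nn"}
    collapsed = []
    for name, head, dep in relations:
        if name == "nn":
            continue  # the nn relation itself disappears
        # Rewrite any mention of the compound head as the polyword.
        collapsed.append(
            (name, polyword.get(head, head), polyword.get(dep, dep))
        )
    return collapsed


stanford = [("nsubj", "live", "I"),
            ("nn", "York", "New"),
            ("prep_in", "live", "York")]
print(collapse_nn(stanford))
# prints: [('nsubj', 'live', 'I'), ('prep_in', 'live', 'New_York')]
```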
Predicative adjectives, copulas
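Note that attributive adjectives can have different meanings than predicative adjectives: "The late Mr. Smith" vs. "Mr. Smith is late." Thus, RelEx attempts to distinguish these, generating an _amod (attributive adjectival modifier) relation for the first, and a _predadj (predicative adjectival modifier) relation for the second, such as _predadj(Smith, late).
In both cases, RelEx retains "late" as an adjectival modifier.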
By contrast, the Stanford parser does not make any special distinction, and instead treats the adjective as if it were a verb, but then marks this pseudo-verb as copular:
nsubj(late-4, Smith-2) cop(late-4, is-3)
Again, RelEx does this in order to simplify pattern-matching for reasoning and question answering: the RelEx markup is more normalized, and corresponds more uniformly to the actual semantics.
Rather than generating a prepositional object and a prepositional complement (the way that many other parsers do), RelEx collapses both of these into a single prepositional relation, with the preposition linking the object to its verb. As a result, RelEx generates at least as many prepositional relations as there are prepositions. Thus, for example, consider the sentence He went to the store. MiniPar outputs the following:
pobj (go, store) pcomp (go, to)
RelEx will collapse these two into one relation:
to(go, store)
Collapsing these two into one seems reasonable: the result is completely unambiguous (the traditional form can be easily obtained, if needed), and it is also easier to scan and comprehend at a glance. Some additional examples follow.
|Preposition||Relation Example||Example Sentence|
|to||to(go, store)||He went to the store.|
|at||at(look, building)||He looked at the building.|
|off||off(wander, field)||He wandered off the field.|
|for||for(jump, ball)||He jumped for the ball.|
|by||by(cause, linebacker)||That fumble was caused by the linebacker.|
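The collapse described above can be sketched in a few lines. This is an illustration under stated assumptions, not RelEx's own code: the input follows the pobj(head, object) and pcomp(head, preposition) convention of the MiniPar example, each head is assumed to carry at most one preposition, and `collapse_prepositions` is a hypothetical name.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (relation name, head, dependent)


def collapse_prepositions(relations: List[Triple]) -> List[Triple]:
    """Merge pobj(head, obj) + pcomp(head, prep) into prep(head, obj)."""
    # Look up the preposition attached to each head word.
    prep_of = {head: dep
               for name, head, dep in relations if name == "pcomp"}
    collapsed = []
    for name, head, dep in relations:
        if name == "pcomp":
            continue  # absorbed into the relation name below
        if name == "pobj" and head in prep_of:
            # The preposition itself becomes the relation name.
            collapsed.append((prep_of[head], head, dep))
        else:
            collapsed.append((name, head, dep))
    return collapsed


minipar = [("pobj", "go", "store"), ("pcomp", "go", "to")]
print(collapse_prepositions(minipar))
# prints: [('to', 'go', 'store')]
```

Since the preposition survives as the relation name, no information is lost, which is why the traditional two-relation form can be recovered when needed.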
Complements in the Stanford parser
The above example of a prepositional complement is treated the same way by both RelEx and the Stanford parser, the latter generating:
prep_to(go, store)
However, the Stanford parser does not always collapse the complement back into the relation. Below are examples where RelEx does collapse the complement, but Stanford does not. This is one reason why RelEx uses substantially fewer relations than Stanford. (RelEx could be easily modified to generate these, and perhaps should be, for compatibility reasons?)
|Stanford type||Example Sentence||Stanford output||RelEx output||Link Grammar disjunct|
|Preposition||He went to the store.||prep_to(go, store)||to(go, store)||MVp- Js+|
|Adverbial clause modifier||The accident happened as the night was falling.||advcl(happen, fall)||as(happen, fall)||MVs- Cs+|
|Agent||The man has been killed by the police.||agent(kill, police) (Stanford does not indicate complement)||by(kill, police)||MVp- Jp+|
|Clausal complement||He says that you like to swim.||ccomp(say, like)||that(say, like)||TH- Cet+|
Functional words in the Stanford parser
RelEx also differs from the Stanford parser in the way that "functional words" are handled. Examples of functional words are determiners ("the" in "the book") and passives ("has been" in "has been killed").
RelEx sticks closer to Tesnière's formulation of dependency: function words are grouped with their head word (are features of the head word), and do not participate in dependencies.
Passives and tenses
Stanford and RelEx differ in how passives, auxiliaries and verb tenses are handled. Again, RelEx attempts to provide a greater degree of semantic normalization, and is less preoccupied with presenting syntactic structure.
For example, the sentence: The man has been killed by the police results in the following Stanford output:
nsubjpass(killed, man) aux(killed, has) auxpass(killed, been)
which accurately reflects the syntactic structure of the sentence. By contrast, RelEx generates:
_obj(kill, man) tense(kill, present_perfect_passive)
Here, the use of _obj follows from the semantically similar paraphrase: The police killed the man, while the tense markup indicates the passive nature of the syntactic construction.
In general, RelEx will present nsubjpass as the object, while collapsing aux and auxpass into a tense markup. The reason for doing this is that it makes pattern-matching for questions and for reasoning simpler, in that the patterns can handle tenses in a uniform way, without having to fiddle with aux/auxpass dependencies.
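The normalization described above can be sketched as a rewrite over Stanford-style triples. This is a rough illustration, not how RelEx actually computes tense: the two lookup tables are hypothetical and cover only the example sentence, and `normalize_passive` is an invented name.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (relation name, head, dependent)

# Hypothetical lookup tables, just large enough for the example sentence.
TENSE_BY_AUXILIARIES: Dict[Tuple[Optional[str], Optional[str]], str] = {
    ("has", "been"): "present_perfect_passive",
}
LEMMA: Dict[str, str] = {"killed": "kill"}


def normalize_passive(relations: List[Triple]) -> List[Triple]:
    """Rewrite nsubjpass as _obj and fold aux/auxpass into a tense tag."""
    aux = {h: d for name, h, d in relations if name == "aux"}
    auxpass = {h: d for name, h, d in relations if name == "auxpass"}
    normalized = []
    for name, head, dep in relations:
        if name in ("aux", "auxpass"):
            continue  # folded into the tense feature below
        verb = LEMMA.get(head, head)
        if name == "nsubjpass":
            # The passive subject is semantically the object of the verb.
            normalized.append(("_obj", verb, dep))
            key = (aux.get(head), auxpass.get(head))
            tense = TENSE_BY_AUXILIARIES.get(key)
            if tense is not None:
                normalized.append(("tense", verb, tense))
        else:
            normalized.append((name, verb, dep))
    return normalized


stanford = [("nsubjpass", "killed", "man"),
            ("aux", "killed", "has"),
            ("auxpass", "killed", "been")]
print(normalize_passive(stanford))
# prints: [('_obj', 'kill', 'man'), ('tense', 'kill', 'present_perfect_passive')]
```

A pattern written against `_obj(kill, man)` now matches both the passive sentence and its active paraphrase, which is the uniformity the text argues for.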
Similar remarks apply to determiners. Consider "A book is on the table." Here, Stanford generates the relations
det(book, a) det(table, the)
By contrast, RelEx marks "table" with the DEFINITE-FLAG feature tag, and doesn't mark "book" at all.
Consider the sentence: "Truffles picked during the spring are tasty." RelEx generates:
during(pick, spring) _predadj(truffle, tasty) _obj(pick, truffle)
whereas Stanford generates:
prep_during(pick, spring) nsubj(tasty, truffle) partmod(truffle, pick) cop(tasty, be)
Aside from the _predadj/nsubj difference, discussed above, note the difference in _obj/partmod. Although the Stanford partmod is closer to the syntax of the sentence, RelEx chooses to generate _obj instead, as this seems to convey the semantic content of the sentence in a more direct way.
Other dependency relations
Other dependency grammars, such as Dekang Lin's MiniPar or the Stanford parser, generate similar relations. The table below documents some of the other commonly used relation types, and their equivalents in RelEx. The point is that RelEx does not generate these relations.
|Relation||Description||Example text||Example relation||comment|
|acomp||adjectival complement||The rose smelled sweet.||acomp(smelled, sweet)||Generated by the Stanford parser, identical to RelEx output _to-be|
|advcl||adverbial clause modifier||The accident happened as the night was falling.||advcl (happen, fall)||Generated by Stanford, see complement discussion above.|
|appos||appositional modifier||Sam, my brother||appos (Sam, brother)||Generated by Stanford, identical to RelEx _appo.|
|aux||(modal) auxiliary||Reagan has died.||aux (died, has)||RelEx indicates the tense feature: tense(die, present_perfect)|
|auxpass||passive auxiliary||Kennedy has been killed.||auxpass(killed, been)||RelEx indicates the tense feature: tense(kill, present_perfect_passive)|
|ccomp||clausal complement||Generated by Stanford,|
|cop||copula||The rose smelled sweet.||cop(smelled, sweet)||Generated by MiniPar, identical to RelEx output _to-be|
|csubj||clausal subject||What she said makes sense.||csubj (make, say)||Generated by Stanford; RelEx uses plain _subj.|
|det||determiner of a noun||the cookie||det(cookie, the)||RelEx uses the DEFINITE-FLAG feature instead.|
|gen||genitive modifier of a noun||Alice's cookie||gen(cookie, Alice)||Identical to RelEx output _poss.|
|infmod||infinitival modifier||RelEx usually generates a plain _obj.|
|mark||complement of adverbial clause modifier||The accident happened as the night was falling.||mark(fall, as)||Generated by Stanford, see complement discussion above.|
|neg||negative||haven't||neg(have, n't)||RelEx usually generates NEGATIVE-FLAG(have, T)|
|nsubjpass||passive nominal subject||rocks were thrown||nsubjpass(thrown, rocks)||Discussed above. RelEx identifies these as _obj, and marks verb with passive feature.|
|num||numeric modifier||three dollars||num(dollar, three)||Identical to RelEx output _quantity.|
|partmod||participial modifier||RelEx usually generates a plain _obj.|
|pcomp||complement of a preposition||It happened in the garden||pcomp(garden, in)||RelEx uses the general prepositional relations instead, e.g. in(happen, garden)|
|prt||phrasal verb particle||They shut down the station.||prt(shut, down)||RelEx always contracts the particle to create a polyword: e.g. _subj(shut_down, they)|
|sc||small clause complement of a verb||Alice forced him to stop.||sc(stop, to)||RelEx uses the general prepositional relations instead, e.g. to(force, stop)|
Standard list of dependency relations
Below is a list of "industry standard" dependency relations, as defined in de Marneffe (2006). This is in turn derived from King et al. (2003), which in turn derives from Carroll et al. (1999). This list is the set of dependencies currently used by the Stanford parser.
The list is structured as a tree, with dep as the root of the tree. Marking a relation as being of type dep simply implies that there is a dependency relation between words. Ideally, relations are marked with the most refined relation possible.
dep - dependent
- aux - auxiliary
  - auxpass - passive auxiliary
  - cop - copula
- conj - conjunct
- cc - coordination
- arg - argument
  - subj - subject
    - nsubj - nominal subject
      - nsubjpass - passive nominal subject
    - csubj - clausal subject
  - comp - complement
    - obj - object
      - dobj - direct object
      - iobj - indirect object
      - pobj - object of preposition
    - attr - attributive
    - ccomp - clausal complement with internal subject
    - xcomp - clausal complement with external subject
    - compl - complementizer
    - mark - marker (word introducing an advcl)
    - rel - relative (word introducing a rcmod)
    - acomp - adjectival complement
  - agent - agent
- ref - referent
- expl - expletive (expletive there)
- mod - modifier
  - advcl - adverbial clause modifier
  - purpcl - purpose clause modifier
  - tmod - temporal modifier
  - rcmod - relative clause modifier
  - amod - adjectival modifier
  - infmod - infinitival modifier
  - partmod - participial modifier
  - num - numeric modifier
  - number - element of compound number
  - appos - appositional modifier
  - nn - noun compound modifier
  - abbrev - abbreviation modifier
  - advmod - adverbial modifier
    - neg - negation modifier
  - poss - possession modifier
  - possessive - possessive modifier ('s)
  - prt - phrasal verb particle
  - det - determiner
  - prep - prepositional modifier (also pmod, sometimes)
- sdep - semantic dependent
  - xsubj - controlling subject
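Because the list is a tree, a pattern written against a general type such as subj can also accept its more refined descendants. The sketch below encodes only part of the tree as a child-to-parent map, with the nesting taken from the Stanford typed dependencies hierarchy; the names `PARENT`, `ancestors` and `is_a` are assumptions made for this illustration.

```python
from typing import Dict, List

# Partial child -> parent map; "dep" is the root and has no parent.
PARENT: Dict[str, str] = {
    "aux": "dep", "auxpass": "aux", "cop": "aux",
    "arg": "dep", "subj": "arg", "comp": "arg",
    "nsubj": "subj", "nsubjpass": "nsubj", "csubj": "subj",
    "obj": "comp", "dobj": "obj", "iobj": "obj", "pobj": "obj",
    "mod": "dep", "amod": "mod", "advmod": "mod", "neg": "advmod",
}


def ancestors(relation: str) -> List[str]:
    """Return the chain of ever more general types, ending at the root."""
    chain = []
    while relation in PARENT:
        relation = PARENT[relation]
        chain.append(relation)
    return chain


def is_a(relation: str, general: str) -> bool:
    """True if relation equals general or is a refinement of it."""
    return relation == general or general in ancestors(relation)


print(ancestors("nsubjpass"))     # prints: ['nsubj', 'subj', 'arg', 'dep']
print(is_a("nsubjpass", "subj"))  # prints: True
print(is_a("dobj", "subj"))       # prints: False
```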
References
- Igor A. Mel'čuk and Alain Polguère. 1987. "A Formal Lexicon in Meaning-Text Theory". Computational Linguistics, vol. 13, pp. 261-275.
- Marie-Catherine de Marneffe, Bill MacCartney, and Christopher D. Manning. 2006. "Generating Typed Dependency Parses from Phrase Structure Parses".
- John Carroll, Guido Minnen, and Ted Briscoe. 1999. "Corpus annotation for parser evaluation". In Proceedings of the EACL workshop on Linguistically Interpreted Corpora (LINC).
- Tracy H. King, Richard Crouch, Stefan Riezler, Mary Dalrymple, and Ronald Kaplan. 2003. "The PARC 700 dependency bank". In 4th International Workshop on Linguistically Interpreted Corpora (LINC-03).
- Marie-Catherine de Marneffe and Christopher D. Manning. 2008. "Stanford typed dependencies manual". September 2008.