Next: Syntactically annotated corpora Up: Practical NLP lexicons Previous: A comparative overview

Preliminary Recommendations

Summary

The lexicons considered agree substantially on the range of linguistic phenomena to be recorded in the lexical entry of a verb (see table 3.3), but much less on how these phenomena are to be encoded. In some cases the relevant information is not encoded directly but can nonetheless be inferred from other available information (marked `infer' in the table below). The main area of disagreement lies at the boundary between syntax and semantics, as shown by the last two columns of the table, the only ones in which minus signs appear (a minus sign means that the phenomenon in question is not accounted for). The fact that the information stored in each lexicon is not distributed uniformly across the levels of linguistic description has a number of consequences:

  1. The lexicons differ greatly in the criteria they adopt for defining an individual lexical entry: in some lexicons (e.g. Comlex), different syntactic realisations of the same complement give rise to different lexical entries, whereas other lexicons (e.g. Acquilex and LDOCE) collapse different argument structures of a verb into one entry, provided that they share a common core meaning.
  2. Some lexicons try to explain at a single level what other lexicons distribute over more than one level. A paradigmatic example is the distinction between control and raising verbs: it is located at the semantic level by lexicons which avail themselves of that level (e.g. Acquilex), captured in purely syntactic terms by lexicons which do not resort to semantics (e.g. Comlex), and neutralised altogether by yet other lexicons (e.g. PLNLP), which use a single category to cover both cases.
  3. As a consequence of both 1 and 2, although there is a common set of observable linguistic phenomena which all lexicons agree in taking into account, the way these phenomena are related to one another varies considerably from lexicon to lexicon.
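Point 1 can be made concrete with a toy sketch. The verb, the meaning label and the frame names below are hypothetical and are not taken from any of the lexicons discussed; the two functions only illustrate the two entry-definition criteria, not the actual format of Comlex, Acquilex or LDOCE.

```python
from collections import defaultdict

# Hypothetical observations: (lemma, core-meaning label, syntactic frame).
# These values are invented for illustration only.
OBSERVATIONS = [
    ("believe", "hold-true", "NP"),
    ("believe", "hold-true", "that-S"),
    ("believe", "hold-true", "NP to-VP"),
]


def split_by_frame(observations):
    """Frame-based criterion: one entry per (lemma, frame) pair."""
    return sorted({(lemma, frame) for lemma, _, frame in observations})


def collapse_by_meaning(observations):
    """Meaning-based criterion: one entry per (lemma, core meaning),
    listing every frame that shares that meaning."""
    entries = defaultdict(list)
    for lemma, meaning, frame in observations:
        if frame not in entries[(lemma, meaning)]:
            entries[(lemma, meaning)].append(frame)
    return dict(entries)


print(len(split_by_frame(OBSERVATIONS)))       # number of frame-based entries
print(len(collapse_by_meaning(OBSERVATIONS)))  # number of meaning-based entries
```

Under these assumptions the same three observations yield three entries under the frame-based criterion and a single entry under the meaning-based one, which is why entry counts are not directly comparable across lexicons.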

 

Lexicon   arg.   syn.   funct.  control  lexical  morphsyn.  frame    deep
          arity  cat.   role             select.  constr.    altern.  struct.
Acquilex  infer  infer  infer   +        +        +          +        +
Comlex    infer  +      +       +        +        +          +        -
Eurotra   infer  +      +       +        +        +          -        +
Genelex   infer  +      +       +        +        +          +        -
ILC       +      +      +       +        +        +          +        -
LDOCE     infer  infer  infer   +        +        +          infer    -
PLNLP     infer  infer  infer   +        +        +          infer    -

Table 3.3: Lexical entry information for verbs
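
The observations drawn from table 3.3 can be checked mechanically. The following is a minimal sketch that transcribes the table as data, reading its two-line header as eight columns (arg. arity, syn. cat., funct. role, control, lexical select., morphsyn. constr., frame altern., deep struct.); the function names are illustrative and not part of any of the lexicons.

```python
# Table 3.3 transcribed as data: one row per lexicon, one value per column.
COLUMNS = [
    "arg. arity", "syn. cat.", "funct. role", "control",
    "lexical select.", "morphsyn. constr.", "frame altern.", "deep struct.",
]

TABLE_3_3 = {
    "Acquilex": ["infer", "infer", "infer", "+", "+", "+", "+", "+"],
    "Comlex":   ["infer", "+", "+", "+", "+", "+", "+", "-"],
    "Eurotra":  ["infer", "+", "+", "+", "+", "+", "-", "+"],
    "Genelex":  ["infer", "+", "+", "+", "+", "+", "+", "-"],
    "ILC":      ["+", "+", "+", "+", "+", "+", "+", "-"],
    "LDOCE":    ["infer", "infer", "infer", "+", "+", "+", "infer", "-"],
    "PLNLP":    ["infer", "infer", "infer", "+", "+", "+", "infer", "-"],
}


def columns_with(value):
    """Return the columns in which at least one lexicon has `value`."""
    return [
        name for i, name in enumerate(COLUMNS)
        if any(row[i] == value for row in TABLE_3_3.values())
    ]


def lexicons_with(column, value):
    """Return the lexicons that have `value` in the given column."""
    i = COLUMNS.index(column)
    return [lexicon for lexicon, row in TABLE_3_3.items() if row[i] == value]


print(columns_with("-"))                   # columns containing a minus sign
print(lexicons_with("deep struct.", "-"))  # lexicons without deep structure
```

Run on this transcription, `columns_with("-")` returns only the last two columns, in line with the remark that disagreement is concentrated at the boundary between syntax and semantics.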

A final point worth stressing is that some lexicons, owing to their theoretical biases and/or their application-driven needs, are rather prescriptive about the sort of information that may be put in the lexicon: Eurotra, for example, allows at most four arguments in a verb entry, each of which can only be assigned a non-terminal syntactic category. Genelex is far more liberal in this respect.
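
The difference between a prescriptive and a liberal lexicon can be sketched as a validation policy applied to a verb entry. In the sketch below only the limit of four arguments and the non-terminal restriction come from the text; the category inventory, the `Policy` class and the unrestricted policy are assumptions made for illustration and do not describe the actual Eurotra or Genelex formats.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed inventory of non-terminal (phrasal) categories, for illustration only.
NON_TERMINAL = {"NP", "PP", "AP", "VP", "S"}


@dataclass(frozen=True)
class Policy:
    max_args: Optional[int]   # None means no upper bound on arguments
    non_terminal_only: bool   # True if arguments must be phrasal categories


# Restrictive policy modelled on the two constraints stated in the text.
RESTRICTIVE = Policy(max_args=4, non_terminal_only=True)
# Hypothetical unrestricted policy, standing in for a more liberal lexicon.
UNRESTRICTED = Policy(max_args=None, non_terminal_only=False)


def violations(arg_categories: List[str], policy: Policy) -> List[str]:
    """Return the list of constraints that a verb entry's arguments break."""
    problems = []
    if policy.max_args is not None and len(arg_categories) > policy.max_args:
        problems.append(
            f"too many arguments: {len(arg_categories)} > {policy.max_args}"
        )
    if policy.non_terminal_only:
        for category in arg_categories:
            if category not in NON_TERMINAL:
                problems.append(f"category not non-terminal: {category}")
    return problems


print(violations(["NP", "NP", "PP", "PP", "S"], RESTRICTIVE))
print(violations(["NP", "NP", "PP", "PP", "S"], UNRESTRICTED))
```

An entry that the restrictive policy rejects may be perfectly acceptable under the unrestricted one, so entries cannot always be mapped between lexicons without loss.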


