

Verb Semantic Classes


The following approaches to building verb semantic classes are outlined in this section: verb classes based on syntactic behaviour (alternations), and verb classes formed from semantic criteria such as thematic roles and elements of Lexical Conceptual Structure. Classifications related to WordNet criteria are discussed in the section devoted to WordNet (3.4.2, 3.4.3). Each of these approaches contributes to a different form of classification, whose usefulness and ease of formation will be evaluated.

The main practical aim of verb semantic classifications is to help structure the lexicon and to allow for a better organized, more homogeneous description of verb semantics. From a more formal point of view, the main aims are the identification of the meaning components forming the semantics of verbs, the specification of the more subtle meaning elements that differentiate closely related verbs, and the study of the cooperation between syntax and semantics.

Description of different approaches

Syntactic Alternations and Verb Semantic Classes

In her book, B. Levin [Lev93] shows, for a large set of English verbs (about 3200), the correlations between the semantics of verbs and their syntactic behavior. More precisely, she shows that some facets of the semantics of verbs have strong correlations with the syntactic behavior of these verbs and with the interpretation of their arguments.

She first precisely delimits the different forms of verb syntactic behavior. Each of these forms is described by one or more alternations (e.g. alternations describe passive forms, there-insertions and reflexive forms). She then proposes an analysis of English verbs according to these alternations: each verb is associated with the set of alternations it undergoes. A preliminary investigation showed that there are sufficient correlations between some facets of the semantics of verbs and their syntactic behavior to allow for the formation of classes. From these observations, Beth Levin has defined about 200 verb semantic classes, where, in each class, verbs share a certain number of alternations.
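The procedure just described, associating each verb with the set of alternations it undergoes and then grouping verbs, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alternation sets are toy data that follow the cut/hit/touch/break judgments reported later in this section, the additional verbs (hack, kick, pat, shatter) are assumed class-mates added for the example, and grouping by identical alternation sets is a simplification of "sharing a certain number of alternations".

```python
from collections import defaultdict

# Toy data (assumption): alternation sets per verb, not Levin's actual tables.
VERB_ALTERNATIONS = {
    "cut":     {"conative", "part-possessor ascension", "middle"},
    "hack":    {"conative", "part-possessor ascension", "middle"},
    "hit":     {"conative", "part-possessor ascension"},
    "kick":    {"conative", "part-possessor ascension"},
    "touch":   {"part-possessor ascension"},
    "pat":     {"part-possessor ascension"},
    "break":   {"middle"},
    "shatter": {"middle"},
}


def group_by_alternations(verb_alternations):
    """Group verbs whose alternation sets are identical."""
    groups = defaultdict(list)
    for verb, alternations in verb_alternations.items():
        # A frozenset is hashable, so it can serve as the class signature.
        groups[frozenset(alternations)].append(verb)
    return {signature: sorted(verbs) for signature, verbs in groups.items()}


CLASSES = group_by_alternations(VERB_ALTERNATIONS)

if __name__ == "__main__":
    for signature, verbs in CLASSES.items():
        print(", ".join(sorted(signature)), "->", verbs)
```

A realistic classification would tolerate exceptions and partial overlap between alternation sets, which this exact-match grouping does not attempt.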

This very important work emerged from the synthesis of specific investigations on particular sets of verbs (e.g. movement verbs), on specific syntactic behaviors and on various types of information extracted from corpora. Other authors have studied in detail the semantics conveyed by alternations, e.g. [Pin89], and the links between them [Gol94].

The alternation system

An alternation, roughly speaking, describes a change in the realization of the argument structure of a verb. The scope of an alternation is the proposition. Modifiers are considered in some cases, but the main structures remain the arguments and the verb. Arguments may be deleted or `moved', NPs may become PPs or vice-versa, and some PPs may be introduced by a new preposition. Alternations may also be restricted by means of constraints on their arguments.

Beth Levin has defined 79 alternations for English. They basically describe `transformations' from a `basic' form. However, these alternations have a priori little to do with the assumptions of Government and Binding theory and Movement theory, in spite of some similarities. The form assumed to be basic usually corresponds to the direct realization of the argument structure, although clearly this point of view may be subject to debate. Here are a few of the most common types of alternations.

The Transitivity alternations introduce a change in the verb's transitivity. In a number of these alternations the subject NP is deleted and one of the objects becomes the subject, which must, in English, be realized. The Middle alternation is typical of this change:
John cuts the cake $\rightarrow$ The cake cuts easily.
As can be noticed, it is often necessary to add an adverb to make the sentence acceptable. The Causative/inchoative alternation concerns a different set of verbs:
Edith broke the window $\rightarrow$ The window broke.
Verbs undergoing this alternation can roughly be characterized as verbs of change of state or position.

Transitivity alternations also include alternations where an object is unexpressed. This is the case for the Unexpressed object alternation, where the object1 is not realized. A number of verbs undergo this alternation. In most cases, the `typical' object is `implicit' or `incorporated' into the verb, or deducible from the subject and the verb. This is the case, e.g., for the Characteristic property of agent alternation:
This dog bites people $\rightarrow$ This dog bites.
We also find alternations that change the object NP into a PP, as in the conative alternation:
Edith cuts the bread $\rightarrow$ Edith cuts at the bread.
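Since an alternation describes a change in the realization of the argument structure, the examples above can be read as small operations on a frame. The following Python sketch makes this concrete; the `Frame` class and the function names are hypothetical, introduced only for illustration, and the code does not check whether a given verb actually licenses the alternation.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Frame:
    """A flat, hypothetical representation of a simple proposition."""
    subject: str
    verb: str
    obj: Optional[str] = None      # direct object NP
    pp: Optional[str] = None       # prepositional phrase
    adverb: Optional[str] = None

    def text(self) -> str:
        words = [self.subject, self.verb, self.obj, self.pp, self.adverb]
        sentence = " ".join(w for w in words if w)
        return sentence[0].upper() + sentence[1:] + "."


def _require_object(frame: Frame) -> str:
    if frame.obj is None:
        raise ValueError("this alternation needs a direct object")
    return frame.obj


def causative_inchoative(frame: Frame) -> Frame:
    # The subject is deleted and the object becomes the subject.
    return Frame(subject=_require_object(frame), verb=frame.verb)


def middle(frame: Frame, adverb: str) -> Frame:
    # Same promotion of the object, plus the adverb that is often required.
    return Frame(subject=_require_object(frame), verb=frame.verb, adverb=adverb)


def unexpressed_object(frame: Frame) -> Frame:
    _require_object(frame)
    return replace(frame, obj=None)


def conative(frame: Frame) -> Frame:
    # The object NP becomes a PP introduced by 'at'.
    return replace(frame, obj=None, pp="at " + _require_object(frame))


if __name__ == "__main__":
    print(causative_inchoative(Frame("Edith", "broke", "the window")).text())
    print(middle(Frame("John", "cuts", "the cake"), "easily").text())
    print(unexpressed_object(Frame("this dog", "bites", "people")).text())
    print(conative(Frame("Edith", "cuts", "the bread")).text())
```

Keeping the alternations as separate functions leaves room for attaching the constraints on arguments mentioned earlier, which are deliberately omitted here.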

Other sets of alternations include the introduction of oblique complements, reflexives, passives, there-insertion, different forms of inversions and the introduction of specific words such as the way-construction.

It is clear that these alternations are specific to English. They are not universal, even though some are shared by several languages (e.g. the passive alternation). Every language has its own alternation system, with a larger or smaller number of alternations. The characteristics of the language, such as case marking, are also an important factor of variation in the form, the status and the number of alternations. English seems to have quite a large number of alternations; this is also the case, e.g., for ancient languages such as Greek. French and the Romance languages in general have far fewer alternations; their syntax is, in a certain way, more rigid. The number of alternations also depends on the way they are defined: in particular, the degree of generality obtained via constraints imposed on context elements is a major factor of variation.

Construction of verb semantic classes

Verb semantic classes are then constructed from verbs, modulo exceptions, which undergo a certain number of alternations. From this classification, a set of verb semantic classes is organized. We have, for example, the classes of verbs of putting, which include Put verbs, Funnel Verbs, Verbs of putting in a specified direction, Pour verbs, Coil verbs, etc. Other sets of classes include Verbs of removing, Verbs of Carrying and Sending, Verbs of Throwing, Hold and Keep verbs, Verbs of contact by impact, Image creation verbs, Verbs of creation and transformation, Verbs with predicative complements, Verbs of perception, Verbs of desire, Verbs of communication, Verbs of social interaction, etc. As can be noticed, these classes only partially overlap with the classification adopted in WordNet. This is not surprising since the classification criteria are very different.

Let us now look in more depth at a few classes and evaluate the use of such classes for natural language applications (note that several research projects make intensive use of B. Levin's classes). Note that, w.r.t. WordNet, the classes obtained via alternations are much less hierarchically structured, which shows that the two approaches are really orthogonal.

There are other aspects which may weaken the practical use of this approach, in spite of its obvious high linguistic interest, from both theoretical and practical viewpoints. The first point is that the semantic definition of some classes is somewhat fuzzy and does not really summarize the semantics of the verbs it contains. An alternative would be to characterize a class by a set of features, shared to various extents by the verbs it is composed of. Next, w.r.t. the semantic characterization of the class, there are some verbs which seem to be really outside the class. Also, as illustrated below, a set of classes (such as movement verbs) does not include all the `natural' classes one may expect (but `completeness' or exhaustiveness has never been claimed to be one of the objectives of this research). This may explain the unexpected presence of some verbs in a class. Finally, distinctions between classes are sometimes hard to make, and this is reinforced by the fact that classes may unexpectedly have several verbs in common. Let us illustrate these observations with respect to two very representative sets of classes: verbs of motion and verbs of transfer of possession (notice that a few other classes of transfer of possession, e.g. deprivation, are in the set of classes of Remove verbs).

Verbs of Motion include 9 classes:

Note that the labels `Roll' and `Run' do not totally cover the semantics of the verbs in the corresponding class. Also, the difference between the two classes is not very clear. Waltz and chase verbs are interesting examples of very specific classes which can be constructed from alternations. However, few domains are represented, and major ones are missing or under-represented (e.g. type of movement, medium of movement, manner of motion, etc.).

Verbs of transfer of possession include 9 classes:

In this example, the difficulty of defining the semantics of a class is evident: labels such as fulfilling or future having do not exactly characterize the corresponding class. Note also that the Get class is very large and contains very diverse verbs. Domain descriptions (family, education, law, etc.) as well as moral judgements on the transfer (legal, illegal, robbery) are not accounted for in this classification.

About the underlying semantics of alternations

It is of much interest to analyze in depth the set of verbs which undergo an alternation. It is also interesting to analyze exceptions, i.e. verbs not associated with an alternation but which are closely related to verbs which are associated with it, in order to narrow down the semantic characterization of this alternation.

Besides the theoretical interest, the underlying semantics conveyed by syntactic constructions plays an important role in semantic composition and in the formation of lexicalization patterns (2.5.1).

There is, first, a commonly assumed principle of non-synonymy of grammatical forms: `a difference in syntactic form always spells a difference in meaning'. We have, for example, the following syntactic forms with their associated Lexical Semantic Template [Gol94]:

From these general observations, we see that form and meaning cannot be considered apart. From the point of view of the principle of compositionality, the meaning of a sentence should not only be derived from the meaning of its components, but it should also include the implicit, partial semantics associated with the syntactic construction. Let us now consider several examples.

About the identification of relevant meaning components

The problem addressed here is the identification in verbs of those meaning components which determine whether a verb does or does not undergo a certain alternation. Pinker ([Pin89], pp. 104 ff.) explains that in the conative construction, where the transitive verb takes an oblique object introduced by the preposition at instead of a direct NP, there is the idea that the subject is attempting to affect the oblique object, but may not succeed. But the conative alternation applies to much narrower sets of verbs than those whose actions could be just attempted and not realized. For example, verbs of cutting and verbs of hitting all undergo the alternation, but verbs of touching and verbs of breaking do not.

It turns out, in fact, that verbs participating in the conative construction describe a certain type of motion and a certain type of contact.

The same situation occurs for the Part-possessor ascension alternation (Ann cuts John's arm $\leftrightarrow$ Ann cuts John on the arm) which is also accepted by verbs of motion followed by contact. Here verbs of breaking do not participate in that alternation whereas verbs of hitting and touching do.

Finally, the Middle alternation, which specifies the ease with which an action can be performed on a theme, is accepted only by verbs that entail a real effect, regardless of whether they involve motion or contact. Therefore, verbs of breaking and of cutting undergo this alternation whereas verbs of touching do not.

As can be seen from these examples, a common set of elements of meaning can be defined for a set of alternations, such as motion, contact and effect, which contributes to differentiating the semantics conveyed by alternations, and therefore to characterizing quite precisely verbs which potentially undergo an alternation or not. Therefore, membership of a verb in a class depends on some aspects of meaning that the semantic representation of the verb constrains. These aspects may moreover be surprisingly subtle and refined, and difficult to identify and to describe in a formal system. These observations reinforce the arguments in favor of a certain autonomy of lexical semantics.
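The role of the meaning components motion, contact and effect can be made explicit with a small Python sketch. The feature assignments and the requirement table below are assumptions chosen to reproduce the judgments reported above; in particular, since verbs of touching accept the Part-possessor ascension alternation, the sketch requires only contact for it.

```python
# Assumed meaning components per verb class representative.
COMPONENTS = {
    "cut":   {"motion", "contact", "effect"},
    "hit":   {"motion", "contact"},
    "touch": {"contact"},
    "break": {"effect"},
}

# Assumed components that a verb must carry to undergo each alternation.
REQUIRES = {
    "conative": {"motion", "contact"},
    "part-possessor ascension": {"contact"},
    "middle": {"effect"},
}


def licensed_alternations(verb):
    """Return the alternations whose required components the verb carries."""
    components = COMPONENTS[verb]
    return {alt for alt, needed in REQUIRES.items() if needed <= components}


if __name__ == "__main__":
    for verb in COMPONENTS:
        print(verb, "->", sorted(licensed_alternations(verb)))
```

The subset test `needed <= components` is the whole mechanism: membership in an alternation follows from the meaning components alone, which is the point made in the paragraph above. Real verbs would need finer components than these three.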

The dative alternation

The dative alternation applies to a number of verbs of transfer of possession, but the semantic components which account for the difference between verbs which do accept it and those which do not are very subtle. This alternation conveys the idea of X CAUSE Y to HAVE Z. However, as noted by [Pin89], while the class of verbs of instantaneous imparting of force causing a ballistic motion (throw, flip, slap) allows the dative alternation, the verbs of continuous imparting of force in some manner causing accompanied motion (pull, push, lift) do not.

Similarly, verbs where "X commits himself that Y will get Z in the future" allow the dative alternation (offer, promise, allocate, allot, assign). There are also verb classes which accept either one or the other form of the dative alternation (with or without the preposition to). Verbs of 'long-distance' communication (fax, telephone) also accept this alternation.

From these examples, it is possible to deduce that the dative alternation is accepted by verbs where the actor acts on a recipient (or a destination) in such a way that causes him to possess something. This is opposed to the actor acting on an object so that it causes it to go to someone. For example, in verbs like push, the actor does not have in mind a priori the destination, but just the object being pushed. On the contrary, ask accepts the dative alternation because when someone is asking something he has (first) in mind the way the listener will react; the `physical' transfer of the information is in general less important.
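The narrow classes mentioned for the dative alternation can be recorded as a simple lookup. The Python sketch below uses only the verbs and class descriptions cited in the text; the data structure, the function names and the frame notation are assumptions made for illustration, and verbs not listed here are deliberately left undecided.

```python
from typing import List, Optional

# (allows the dative alternation?, verbs), keyed by narrow class.
DATIVE_CLASSES = {
    "instantaneous imparting of force, ballistic motion":
        (True, {"throw", "flip", "slap"}),
    "continuous imparting of force, accompanied motion":
        (False, {"pull", "push", "lift"}),
    "future having":
        (True, {"offer", "promise", "allocate", "allot", "assign"}),
    "long-distance communication":
        (True, {"fax", "telephone"}),
}


def allows_dative(verb: str) -> Optional[bool]:
    """True/False for listed verbs, None when the verb is not covered."""
    for allowed, verbs in DATIVE_CLASSES.values():
        if verb in verbs:
            return allowed
    return None


def dative_frames(verb: str) -> List[str]:
    """Frames available to a listed verb, in a schematic notation."""
    allowed = allows_dative(verb)
    if allowed is None:
        raise KeyError(f"no narrow class recorded for {verb!r}")
    frames = [f"{verb} NP(theme) to NP(recipient)"]
    if allowed:
        frames.insert(0, f"{verb} NP(recipient) NP(theme)")
    return frames


if __name__ == "__main__":
    for verb in ("throw", "push", "promise", "fax"):
        print(verb, "->", dative_frames(verb))
```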

The location alternations

The location alternations (a family of alternations which involve a permutation of object1 and object2 and a preposition change) are also of much interest. Participation in certain of these alternations allows one to predict the type of motion and the nature of the end state. Verbs which focus only either on the motion (e.g. pour) or on the resulting state (e.g. fill) do not alternate. Verbs that alternate constrain in some manner both motion and end state. Let us now specify in more depth these constraints, since in fact quite a few verbs do alternate.

For example, let us consider the into/with alternation. [Pin89] differentiates among verbs which more naturally accept the into form as their basic form and those which alternate with a with form. Their general form is:
Verb NP(+theme) onto NP(+destination), and they alternate in:
Verb NP(+destination) with NP(+theme).
These verbs naturally take the theme as object (e.g. pile). Other verbs more naturally take the location/container as object (e.g. stuff), their basic form is more naturally:
Verb NP(location) with NP(+theme), and alternate in:
Verb NP(+theme) onto NP(+destination).
For these two types of constructions, only a very few verbs strictly require the presence of the two objects.

If we now consider the first set of verbs, those whose basic form is more naturally the `into/onto' form, then verbs which have one of the following properties alternate: simultaneous forceful contact and motion of a mass against a surface (brush, spread, ...), vertical arrangement on a horizontal surface (heap, pile, stack), force imparted to a mass, causing ballistic motion along a certain trajectory (inject, spray, spatter), etc. Those which do not alternate have for example one of the following properties: a mass is enabled to move via gravity (spill, drip), a flexible object extended in one direction is put around another object (coil, spin, twist, wind), a mass is expelled from inside an entity (emit, expectorate, vomit). As can be seen here, the properties at stake are very precise and their identification is not totally trivial, especially for verbs which can be used in a variety of utterances, with some slight meaning variations.
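For the verbs whose basic form is the `into/onto' form, the alternation can be sketched as a permutation of the two objects guarded by the narrow properties listed above. The Python fragment below is a minimal sketch: the verb lists come from the text, while the function name, the bare-stem verb forms and the example noun phrases are assumptions.

```python
# Verbs from the text, grouped by the narrow property they instantiate.
ALTERNATING = {
    "forceful contact and motion of a mass against a surface": {"brush", "spread"},
    "vertical arrangement on a horizontal surface": {"heap", "pile", "stack"},
    "force imparted to a mass, ballistic motion": {"inject", "spray", "spatter"},
}
NON_ALTERNATING = {
    "mass enabled to move via gravity": {"spill", "drip"},
    "flexible object put around another object": {"coil", "spin", "twist", "wind"},
    "mass expelled from inside an entity": {"emit", "expectorate", "vomit"},
}


def _members(table):
    return set().union(*table.values())


def locative_frames(verb, theme, destination, prep="onto"):
    """Return the basic form and, if licensed, the 'with' variant."""
    frames = [f"{verb} {theme} {prep} {destination}"]
    if verb in _members(ALTERNATING):
        # Permute the two objects and change the preposition to 'with'.
        frames.append(f"{verb} {destination} with {theme}")
    elif verb not in _members(NON_ALTERNATING):
        raise KeyError(f"no narrow property recorded for {verb!r}")
    return frames


if __name__ == "__main__":
    print(locative_frames("pile", "the books", "the shelf"))
    print(locative_frames("spill", "water", "the floor"))
```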

These properties are derived from the observation of syntactic behaviors. While some properties seem to have a clear ontological status, others seem to be much more difficult to characterize. They seem to be a conglomeration of some form of more basic properties.

Semantics of the verb and semantics of the construction

Let us now consider the combination of a verb, with its own semantics, within a syntactic construction. The Construction Grammar approach [Gol94] sheds a particularly clear and insightful light on this interaction; let us present here some of its aspects, relevant to the verb semantic class system. The first point concerns the nature of the verb semantics, the nature of the semantics of a construction and the characterization of the interactions between these two elements. The second point concerns the meaning relations between constructions. These elements are of much importance for lexicalization and the construction of propositions (2.5.1).

Verbs usually have a central use, characterized by a specific syntactic form, but they may also be used with a large variety of other syntactic forms. In this case, the meaning of the proposition may be quite remote from the initial meaning of the verb. Let us consider a few illustrative cases. In:
Edith baked Mary a cake.
the initial sense of bake becomes somewhat marginal, in favor of a more global meaning:
`Edith INTENDS to CAUSE Mary TO HAVE cake'.
There is not here a special sense of bake which is used, but bake describes a kind of `manner' of giving Mary a cake.

Consider now the case of slide, suggested by B. Levin. From the two following sentences:
Edith slid Susan/*the door the present.
Edith slid the present to Susan/to the door.
one may conclude that there are two senses for slide (probably very close). The first sense would constrain the goal to be animate while the second would have no constraint. Now, if we insist, in the ditransitive construction, that the goal must be animate, then we can postulate just one sense for slide, which is intuitively more conceptually appropriate. We then need to posit constraints in the alternations on the nature of the arguments, which would then allow only those verbs which meet the constraints to undergo that alternation. As noticed very early by Lakoff, a verb alone (and its associated lexical semantics) cannot be used to determine whether a construction is acceptable; it is necessary to take into account the semantics of the arguments.
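The single-sense analysis of slide amounts to placing the animacy constraint on the construction rather than on the verb. A minimal Python sketch of this idea follows; the `Argument` class, the construction labels and the constraint table are hypothetical names used only for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Argument:
    text: str
    animate: bool


# Constraints are attached to constructions, so 'slide' needs one sense only.
CONSTRUCTION_CONSTRAINTS = {
    # Edith slid Susan the present / *Edith slid the door the present
    "ditransitive": lambda goal: goal.animate,
    # Edith slid the present to Susan / to the door
    "prepositional-to": lambda goal: True,
}


def acceptable(construction: str, goal: Argument) -> bool:
    """Check the goal argument against the construction's constraint."""
    return CONSTRUCTION_CONSTRAINTS[construction](goal)


if __name__ == "__main__":
    susan = Argument("Susan", animate=True)
    door = Argument("the door", animate=False)
    for construction in CONSTRUCTION_CONSTRAINTS:
        for goal in (susan, door):
            print(construction, goal.text, acceptable(construction, goal))
```

The verb does not appear in the check at all, which reflects the observation attributed to Lakoff: acceptability depends on the semantics of the arguments, not on the verb alone.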

Depending on the construction and on the verb, the verb may either play an important part in the elaboration of the semantics of the proposition or may simply express the means, the manner, the circumstances or the result of the action, while the construction describes the `central' meaning. In fact, the meanings of verbs and of constructions often interact in very subtle ways. One might conclude then that there is no longer a clear separation between lexical rules and syntactic rules.

The difficulty is then to identify and describe the syntactically relevant aspects of verb meaning, i.e. those aspects which are relevant for determining the syntactic expression of arguments, via linking rules. Pinker notes that these aspects should be few in number, since they resemble the characteristics of closed classes. This is not very surprising, since syntactic alternations form a closed set of elements.

Classification of verbs with respect to semantic properties relevant for describing thematic relations

Having dealt with alternations, let us turn to thematic relations and their role in the classification of verbs. Thematic relations express generalizations on the types of lexical functions that are established between the verb and its arguments in the predication. There is a consensus among researchers that assignment of thematic roles to the arguments of the predicate imposes a classification on the verbs of the language. Since the type of thematic roles and their number are determined by the meaning of the verb, the lexical decomposition of verb meanings seems to be a prerequisite for semantic classification of verbs. The close affinity between the compositional and relational lexical meanings plays a central role in the classifications of verbs outlined in this subsection (see 2.4.1, 2.5.2, 2.6.2).

The verb classifications surveyed below were developed within the frameworks of Case Grammar and Role and Reference Grammar (RRG). Works of Chafe [Cha70], Cook [Coo79] and Longacre [Lon76] address the issues of verb classification with regard to thematic roles within the framework of the Case Grammar model. RRG, a structural-functionalist theory of grammar, is presented in works of Foley and Van Valin [Fol84] and Van Valin [VanV93]. Characteristic of RRG is a detailed treatment of lexical representation that proves to be instrumental in describing the thematic relations in typologically different languages. It also incorporates the insights of Dowty's and Jackendoff's theories. There is, however, an important difference in the treatment of thematic relations within those two frameworks. In Case Grammar, they have a double function, namely, they serve as a partial semantic representation of the lexical meaning and also as an input to syntactic operations such as subjectivization, objectivization and raising. In the RRG model, thematic relations have only the second function.

This difference highlights the problem of selection of semantic content in NLP lexicons. The following arguments might be posed in favour of the variant which includes the information on a partial semantic representation in the lexicon: (i) significant generalizations about lexical meaning of verbs are accounted for in an explicit way in the verb entry; (ii) such information can support disambiguation of senses in case of polysemy; (iii) as a consequence of (ii), the semantic correctness of verb classifications increases, which in turn can improve the results of syntactic and semantic rules that operate on verb classes; (iv) it can also contribute to the integration of semantic and syntactic content in the lexicon.

There is no doubt that the model of semantic roles from the seventies, and in particular its repertory of roles and definitions, has to be replaced by a more stringent semantic model to suit the needs of NLP. The combination of the Dowty [Dow89] model of protoroles with the model of thematic sorts proposed by Poznansky and Sanfilippo [San92a] and elaborated in Sanfilippo [San93b] seems to be a very interesting proposal or solution (cf. 2.6.2 for description of these models).

Let's finish these general remarks with a quotation from [Lon76], which captures the essentials of verb classification w.r.t. semantic roles. "An understanding of the function of cases or roles is insightful for the understanding of language. Even more insightful, however, is the grouping of these roles with verb types with which they characteristically occur. To do this we must specify features, which distinguish one set of the verbs from another set of verbs, then we must specify the roles that occur with verbs characterised by these features. The result will be sets of verbs with characteristic constellations of accompanying substantives in given role...Such a set of verbs with characteristic accompanying nouns in particular roles is called a case frame. To assemble and compare the case frames of a language is to evolve a semantic typology or classification of its verbs... As soon as one begins to assemble a number of case frames, similarities in sets of case frames begin to be evident. This leads to the feeling that case frames should constitute some sort of system, i.e. that they are not merely list or inventory, but a system with intersecting parameters."

Chafe's basic verb types

Characteristic of Chafe's approach is the position that a sentence is built around a predicative element, usually the verb. The nature of the verb determines what nouns will accompany it, what the relation of these nouns to it will be and how these nouns will be semantically specified. For example, if the verb is specified as an action, as in The men laughed, such a verb dictates that the related noun is an agent, which might be further specified as animate.

Chafe distinguished four basic verb types: states, processes, actions and action-processes. State verbs describe the state or condition of a single argument (The elephant is dead) and they associate with Patient. Non-state verbs are subdivided into three subclasses: processes, actions and action-processes. Processes express a change of condition or state in their argument (The elephant died). They cooccur with Patients. Actions describe something that the verb's argument does or performs (Harriet sang), hence Agent is associated with action verbs. Action-processes account for both actions and processes. They have two arguments, the performer of the action, Agent, and the thing undergoing the process, Patient (The tiger killed the elephant). Chafe offers a number of semantico-syntactic tests that help to distinguish the basic verb types. The following relations are discussed in Chafe's model: Agent, Patient, Experiencer, Beneficiary, Complement, Locative and Instrument. He also draws attention to the "derivational" relations between these four basic verb types, which can be captured by postulating features such as inchoative, causative, decausative and resultative. Thus, for example, the feature "inchoative" when added to a state gives a process. These derivational features, which often can be manifested morphologically, reflect the compositionality of verb meaning.
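Chafe's four basic verb types, the roles they associate with and the derivational link given in the text can be written down as a small table. The Python sketch below records only what is stated above; the dictionary layout and the `derive` function are assumptions, and the derivations triggered by the other features (causative, decausative, resultative) are left out because the text does not spell them out.

```python
# Basic verb type -> (associated roles, example from the text).
CHAFE_TYPES = {
    "state":          (("Patient",), "The elephant is dead"),
    "process":        (("Patient",), "The elephant died"),
    "action":         (("Agent",), "Harriet sang"),
    "action-process": (("Agent", "Patient"), "The tiger killed the elephant"),
}

# (derivational feature, source type) -> derived type.
# Only the derivation mentioned in the text is recorded.
DERIVATIONS = {
    ("inchoative", "state"): "process",
}


def roles(verb_type):
    """Roles that accompany a verb of the given basic type."""
    return CHAFE_TYPES[verb_type][0]


def derive(feature, verb_type):
    """Apply a derivational feature; None if the derivation is not recorded."""
    return DERIVATIONS.get((feature, verb_type))


if __name__ == "__main__":
    print(roles("action-process"))
    print(derive("inchoative", "state"))
```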

Cook's verb classification

Cook's case grammar matrix is a system based on two parameters. The vertical parameter has four values: state verbs, process verbs, action verbs and action processes, taken from Chafe [Cha70]. The other parameter has also four values: either with no further nuclear role added (e.g. only agent (A) and/or patient (P)), or with experiencer (E), benefactive (B) or locative (L) added as further nuclear elements. The content in Cook's matrix is presented below. (After Cook [Coo79] pp. 63-65.)

A. The four basic verb types

B. The four experiential verb types

C. The four benefactive verb types

D. The four locative verb types
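The two parameters of Cook's matrix can be enumerated mechanically. The Python sketch below builds the sixteen cells by crossing the four basic verb types with the four values of the second parameter. The role sets it produces are an assumption: they simply add the extra nuclear role to the roles associated above with Chafe's four types, and they are not a reproduction of the cell contents given by Cook.

```python
from itertools import product

# Roles associated with the four basic verb types (A = agent, P = patient).
BASIC_ROLES = {
    "state": ["P"],
    "process": ["P"],
    "action": ["A"],
    "action-process": ["A", "P"],
}

# Second parameter: the further nuclear role added, if any.
ADDED_ROLE = {
    "basic": None,
    "experiential": "E",
    "benefactive": "B",
    "locative": "L",
}


def cook_matrix():
    """Return {(verb type, column): role list} for all 16 cells."""
    matrix = {}
    for (verb_type, base), (column, extra) in product(
        BASIC_ROLES.items(), ADDED_ROLE.items()
    ):
        matrix[(verb_type, column)] = base + ([extra] if extra else [])
    return matrix


MATRIX = cook_matrix()

if __name__ == "__main__":
    for cell, frame in MATRIX.items():
        print(cell, frame)
```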

Longacre's verb classification

Longacre extended the number of nuclear cases to 10, which considerably enlarged the number of verb classes. His scheme of case frames is reminiscent of the periodic table of the chemical elements. The horizontal parameter accounts for the four basic verb types. The vertical parameter covers the verb classes specified below and marked with letters (a) to (h). The following thematic roles were posited as nuclear by Longacre: Experiencer (E), Patient (P), Agent (A), Range (R), Measure (M), Instrument (I), Locative (L), Source (S), Goal (G) and Path (Pa). Time, Manner and Accompaniment were considered peripheral roles. The verb classes specified in Longacre's scheme include the following (for the complete exemplification see [Lon76], pp. 44-9):

In Longacre's frame scheme there are 45 filled cells with a total of 48 case frames. It is characterised by an overall regularity with some spots of local irregularity. Longacre observes that rows (a-d') may have Experiencer but not Patient while rows (e-h') can have Patient but not Experiencer. This, together with the distribution of Agent in the columns describing action-processes and actions, reveals some major verb classes in the case frame approach.

Role and Reference Grammar (RRG)

The Role and Reference Grammar (RRG) has some characteristics in common with the models discussed above. The theory of verb classes occupies a central position in the system of lexical representation in RRG. The verb is also assumed to be a determining element in the nucleus predication. RRG starts with the Vendler [Ven68] classification of verbs into states (e.g. have, know, believe), achievements (e.g. die, realise, learn), accomplishments (e.g. give, teach, kill) and activities (e.g. swim, walk, talk). It utilises a modified version of the representational scheme proposed in Dowty [Dow89] to capture the distinctions between these verb classes.

Dowty explains the differences between the verb classes in terms of a lexical decomposition system in which stative predicates (e.g. know, be, have) are taken as basic and other classes are derived from them. Thus achievements, which are semantically inchoative, are treated as states plus a BECOME operator, e.g. BECOME know' "learn". Accomplishments, which are inherently causative, are represented by the operator CAUSE linked to the achievement operator BECOME, e.g. CAUSE [BECOME know'] "teach". Activities are marked by the operator DO for agentive verbs. These decompositional forms are termed Logical Structures (LS) by Dowty. In RRG they are interpreted in the following way (after Van Valin [VanV93], p. 36):

Verb Class                 Logical Structure
STATE                      predicate'(x) or (x,y)
ACHIEVEMENT                BECOME predicate'(x) or (x,y)
ACTIVITY (+/- Agentive)    (DO (x) [predicate'(x) or (x,y)])
ACCOMPLISHMENT             $\phi$ CAUSE $\psi$, where $\phi$ is normally an activity predicate and $\psi$ an achievement predicate
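The way Logical Structures are built from stative predicates and the operators BECOME, CAUSE and DO can be illustrated with a few string-building functions. This Python sketch is only an illustration of the decomposition: the function names and the bracketing conventions are assumptions, and the activity predicate `act'(z)` used for "teach" is a hypothetical placeholder for the unspecified causing activity.

```python
def pred(name, *args):
    """A basic predicate, e.g. know'(x, y)."""
    return f"{name}'({', '.join(args)})"


def become(ls):
    """Achievement: a state under the BECOME operator."""
    return f"BECOME {ls}"


def do(actor, ls):
    """Agentive activity: DO applied to an activity predicate."""
    return f"DO({actor}, [{ls}])"


def cause(phi, psi):
    """Accomplishment: phi (normally an activity) CAUSE psi (an achievement)."""
    return f"[{phi}] CAUSE [{psi}]"


know = pred("know", "x", "y")              # state
learn = become(know)                       # achievement
teach = cause(pred("act", "z"), learn)     # accomplishment (placeholder activity)
cry_controlled = do("x", pred("cry", "x")) # controlled activity

if __name__ == "__main__":
    for label, ls in [("know", know), ("learn", learn),
                      ("teach", teach), ("cry (controlled)", cry_controlled)]:
        print(f"{label}: {ls}")
```

Because each operator wraps an existing structure, the derivation of achievements and accomplishments from states is visible directly in the nesting.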

These LSs are the starting point for the interpretation of the thematic relations in RRG. Thematic relations are defined in terms of argument positions in the decomposed LS representations, following Jackendoff [Jac90]. Their inventory and distribution among the verb classes is presented in the table below (after Van Valin [VanV93], p. 39):

I. State verbs
   A. Locational                be-at'(x,y)          x=locative, y=theme
   B. Non-locational
      1. State or condition     broken'(x)           x=patient
      2. Perception             see'(x,y)            x=experiencer, y=theme
      3. Cognition              believe'(x,y)        x=experiencer, y=theme
      4. Possession             have'(x,y)           x=locative, y=theme
      5. Equational             be'(x,y)             x=locative, y=theme
II. Activity verbs
   A. Uncontrolled
      1. Non-motion             cry'(x,y)            x=effector, y=locative
      2. Motion                 roll'(x)             x=theme
   B. Controlled                DO (x, [cry'(x)])    x=agent


Patient is associated with the single argument of a single-argument stative verb of state or condition, as in The watch is broken. Theme is the second argument of two-place stative verbs, e.g. the magazine in The magazine is on the desk. The desk is a locative, the first argument of two-place locational stative verbs. Experiencer is the first argument of two-place stative perception verbs. The single argument of a motion activity verb is a theme, as it undergoes a change of location: The ball rolled. The first argument of a non-motion activity verb is an effector, the participant which does some action and which is unmarked for volition and control, as in The door squeaks. Such an interpretation of thematic roles leads to the conclusion that the thematic roles in RRG are independently motivated.
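Because thematic relations are defined by argument positions in the Logical Structure, role assignment reduces to a table lookup on the verb class. The Python sketch below encodes the distribution given above for state and activity verbs; the key names and the `assign_roles` function are assumptions made for the illustration.

```python
# (major class, subclass) -> roles of the LS arguments, in order (x, then y).
ROLE_TABLE = {
    ("state", "locational"):               ("locative", "theme"),
    ("state", "state or condition"):       ("patient",),
    ("state", "perception"):               ("experiencer", "theme"),
    ("state", "cognition"):                ("experiencer", "theme"),
    ("state", "possession"):               ("locative", "theme"),
    ("state", "equational"):               ("locative", "theme"),
    ("activity", "uncontrolled non-motion"): ("effector", "locative"),
    ("activity", "uncontrolled motion"):     ("theme",),
    ("activity", "controlled"):              ("agent",),
}


def assign_roles(major_class, subclass, arguments):
    """Pair each argument (in LS order) with its thematic role."""
    roles = ROLE_TABLE[(major_class, subclass)]
    if len(arguments) > len(roles):
        raise ValueError("more arguments than LS positions")
    # zip stops at the shorter sequence, so optional arguments may be omitted.
    return dict(zip(arguments, roles))


if __name__ == "__main__":
    # The magazine is on the desk: be-at'(the desk, the magazine)
    print(assign_roles("state", "locational", ["the desk", "the magazine"]))
    print(assign_roles("state", "state or condition", ["the watch"]))
    print(assign_roles("activity", "uncontrolled motion", ["the ball"]))
    print(assign_roles("activity", "uncontrolled non-motion", ["the door"]))
```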

LS and thematic roles are part of the semantic representation in RRG. Thematic roles function as one of the links between the semantic and syntactic representation. The semantic macroroles, Actor and Undergoer, are the other link. Macroroles conceptually parallel the grammatical notions of arguments in a transitive predication. Being an immediate link to the level of Syntactic Functions, they control the assignment of syntactic markers to the arguments of the verb. It should be noted that the rich delineation of the lexical representations in the RRG model is well suited for the description of typologically different languages.

The classes of verbs in the table above cover different cognitive dimensions of language. The main cognitive distinction is drawn between two conceptual categories, State and Activity. State verbs are subclassified into two major classes comprising locational and non-locational verbs. Among the non-locational verbs, the following subclasses are distinguished: state or condition, perception, cognition, possession and equational verbs. Activity verbs are subdivided with respect to the control component. Uncontrolled verbs are further subclassified with respect to the motion component.

Comparing approaches in Case Grammar oriented models and RRG

A common characteristic of the approaches to the classification of verbs sketched in this subsection is that they search for a subset of recurrent semantic components and semantic roles that are relevant for the description of thematic relations. The two approaches reveal some interesting parallels concerning the decompositional analysis of verb meanings with regard to the subclassification of verbs into more or less equivalent types; thus states = states, activity = action, achievement = process, accomplishment = action-process. Since the LSs in the RRG model correspond to the thematic relations that other theories associate with a verb in its lexical entry, the classifications of verbs within the two frameworks are partially similar. These two frameworks also show some overlap as far as the semantic affinity of the major subclasses of verbs is concerned.

The frameworks differ with regard to (a) the choice of description model, e.g. hierarchy vs. matrix model, (b) the level of semantic granularity in the subclassification of verbs, (c) the function that thematic relations play in the semantic representation in the respective frameworks.

The issues addressed within these frameworks turn attention, in the first place, to some basic linguistic questions that have to be answered when approaching the description and formalization of the lexical meaning in lexicons designed for both general and NLP purposes. The classification of verbs w.r.t. thematic relations should be seen as a preparatory stage that aims at a partial semantic representation of the lexical meaning of verbs. It has to be well adjusted to the chosen model of the semantic representation, which in turn has to be integrated with the model of syntactic representation.

LCS-Based Verb Classification

Let us now introduce the Lexical Conceptual Structure (LCS), which is an elaborated form of semantic representation with a strong cognitive dimension. The LCS came in part from the Lexical Semantics Templates (see above) and from a large number of observations such as those of [Gru67]. The present form of the LCS, under which it gained its popularity, is due to Jackendoff [Jac83], [Jac90]. The LCS was designed within a linguistic and cognitive perspective. It has some similarities, but also major differences, with approaches closer to Artificial Intelligence such as semantic nets or conceptual graphs. The LCS is basically designed to represent the meaning of predicative elements and the semantics of propositions. It is therefore substantially different from frames and scripts, which describe situations in the world such as going to a restaurant or being cured of a disease. Nor is it specifically oriented toward communication acts or toward the representation of abstract objects of the world (by means of e.g. states of affairs or complex indeterminates), represented as objects, as in Situation Semantics (e.g. a complex indeterminate to model a person who utters a sentence).

Main Principles and characteristics

The LCS is mainly organized around the notion of motion, other semantic/cognitive fields being derived from motion by analogy (e.g. change of possession, change of property). This analogy works well in a number of cases, as shall be seen below, but turns out to be unnatural in a number of others. From that point of view, the LCS should be considered both as a semantic model providing a representational framework and a language of primitives on the one hand, and as a methodology on the other hand, allowing for the introduction of new primitives to the language whenever justified. Another important characteristic of the LCS is its close relation to syntax, which allows the implementation of a comprehensive system of semantic composition rules. For that reason, the LCS is often compared to a kind of X-bar semantics.

Let us now introduce the different elements of the LCS language. They are mainly: conceptual categories, semantic fields and primitives. Other elements are conceptual variables, semantic features (similar to selectional restrictions, e.g. eatable entity, liquid), constants (representing non-decomposable concepts such as money, butter) and lexical functions (which play minor roles). [Pin89] also introduces relations between events (e.g. means, effect). These latter elements are not presented formally, but by means of examples.

Conceptual Categories

[Jac83] introduces the notion of conceptual constituent defined from a small set of ontological categories (also called conceptual parts of speech), among which the most important are: thing, event, state, place, path, property, purpose, manner, amount, time. These categories may subsume more specific ones, e.g. the category thing subsumes: human, animal, object. These categories may be viewed as the roots of a selectional restriction system.
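A minimal sketch of how category subsumption could serve as the root of a selectional restriction check is given below. Only the fragment stated in the text (thing subsumes human, animal, object) is encoded; the names PARENT and subsumes are assumptions made for the illustration.

```python
# Parent links for the only fragment of the hierarchy given in the text:
# thing subsumes human, animal and object. PARENT and subsumes are
# hypothetical names chosen for this sketch.
PARENT = {"human": "thing", "animal": "thing", "object": "thing"}


def subsumes(general, specific):
    """Return True if `general` is `specific` or one of its ancestors."""
    current = specific
    while current is not None:
        if current == general:
            return True
        current = PARENT.get(current)  # None once a root category is reached
    return False


print(subsumes("thing", "human"))   # a human satisfies a restriction to thing
print(subsumes("human", "thing"))   # the reverse does not hold
```

A selectional restriction stated at the level of thing is then satisfied by any argument whose category lies below thing in the hierarchy.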

The assignment of a conceptual category to a lexical item often depends on its context of utterance, for example the noun meeting is assigned the category time in:
after the meeting
while it is assigned the category event in:
the meeting will be held at noon in room B34.
There are constraints on the types of conceptual categories which can be assigned to a lexical item. For example, a color will never be assigned categories such as event or distance.

Conceptual categories are represented as an index on a bracketed structure:
[<semantic category> ]
where the contents of that structure have the type denoted by the semantic category. Here are a few syntactic realizations of conceptual categories:
[thing Mozart ] is [property famous ].
He composed [amount many [thing symphonies ]].
[event The meeting ] starts at [time 2 PM ].
Ann switched off the electricity [purpose to prevent a fire ].
The light is [state red ].
[event Edith begins a new program ].

Conceptual primitives

The LCS is based on a small number of conceptual primitives. The main ones are BE, which represents a state, and GO, which represents any event. Other primitives include: STAY (a BE with an idea of duration), CAUSE (for expressing causality), INCH (for inchoative interpretations of events), EXT (spatial extension along something), REACT, EXCH (exchange), ORIENT (orientation of an object), etc. Their number remains small, while covering quite a large number of concepts. A second, slightly larger set of primitives (about 50) describes prepositions: AT, IN, ON, TOWARD, FROM, TO, BEHIND, UNDER, VIA, etc. These primitives are 'lower' in the primitive hierarchy and their number is a priori fixed once and for all.

The LCS uses some principles put forward in [Gru67], namely that the primitives used to represent concepts of localization and movement can be transposed to other fields by analogy, and generalized. The main fields considered in the LCS are the following: localization (+loc), time (+temp), possession (+poss) and expression of characteristics of an entity, its properties (+char,+ident) or its material composition (+char,+comp). [Pin89] introduces additional fields such as: epistemic (+epist) and psychological (+psy).

Primitives can then be specialized to a field, e.g. GO+loc describes a change of location, GO+temp a change of time, GO+poss a change of possession, and GO+char,+ident a change in the value of a property (e.g. weight, color).
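The combination of a conceptual category, a primitive, its semantic fields and its arguments can be sketched as a small recursive data structure. The class name LCS, its attribute names and the render method are assumptions made for this illustration; only the bracketed notation it prints is taken from this section.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LCS:
    """One bracketed constituent: [category PRIMITIVE+fields (args)]."""
    category: str                    # conceptual category, e.g. "event"
    primitive: str = ""              # e.g. "GO"; empty for an open slot
    fields: Tuple[str, ...] = ()     # semantic fields, e.g. ("+poss",)
    args: Tuple["LCS", ...] = ()     # embedded constituents

    def render(self):
        """Print the structure in the bracketed notation used in the text."""
        head = self.primitive + ",".join(self.fields)
        if not self.args:
            return f"[{self.category} {head}]"
        inner = ", ".join(arg.render() for arg in self.args)
        return f"[{self.category} {head} ({inner})]"


# GO specialized to the possession field, with open thing and path slots.
change_of_possession = LCS("event", "GO", ("+poss",), (LCS("thing"), LCS("path")))
print(change_of_possession.render())  # [event GO+poss ([thing ], [path ])]
```

Specializing the same primitive to another field only changes the fields tuple, e.g. ("+loc",) for a change of location or ("+char", "+ident") for a change in the value of a property.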

Verb classes based on LCS patterns

The LCS allows us to classify verbs at different levels of granularity and according to different conceptual dimensions:

the state / event distinction,

the primitive root of the representation they are built from: GO, BE, CAUSE, etc.,

the semantic fields marking that primitive,

the different arguments of the verb and their semantic fields.

It is clear that classifications must be based on generic LCS patterns. A verb belongs to the class associated with a pattern if and only if its representation is subsumed by that LCS pattern (LCS patterns can be viewed as types). This classification method, based on a conceptual representation, has some similarities with classifications based on semantic roles (REF), since it has been shown that LCS patterns are strongly related to (and are even more precise than) semantic roles. It also allows us to introduce prepositions into verb meaning (REF) and to characterize, to some extent, in extension the lexical patterns associated with PPs (REF).
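The subsumption test can be sketched as a recursive comparison between a verb representation and a pattern. Everything below is an assumption made for illustration: the dictionary encoding, the names subsumed_by and classify, and the simplified representation used as an example, which is not a claim about the actual LCS of any verb. The optional CAUSE wrapper is not handled.

```python
def subsumed_by(rep, pattern):
    """Return True if `rep` is an instance of `pattern`.

    Both are dicts with a mandatory "cat" key and optional "prim",
    "fields" and "args" keys. A pattern without "args" is an open slot
    that accepts any content of the right category.
    """
    if rep["cat"] != pattern["cat"]:
        return False
    if "prim" in pattern and rep.get("prim") != pattern["prim"]:
        return False
    if "fields" in pattern and set(rep.get("fields", ())) != set(pattern["fields"]):
        return False
    pattern_args = pattern.get("args")
    if pattern_args is None:
        return True
    rep_args = rep.get("args", ())
    return len(rep_args) == len(pattern_args) and all(
        subsumed_by(r, p) for r, p in zip(rep_args, pattern_args)
    )


# Two generic patterns written in this encoding.
PATTERNS = {
    "transfer of possession": {
        "cat": "event", "prim": "GO", "fields": ("+poss",),
        "args": ({"cat": "thing"}, {"cat": "path"}),
    },
    "possession (state)": {
        "cat": "state", "prim": "BE", "fields": ("+poss",),
        "args": ({"cat": "thing"}, {"cat": "place"}),
    },
}


def classify(rep):
    """Names of all classes whose pattern subsumes the representation."""
    return [name for name, pattern in PATTERNS.items() if subsumed_by(rep, pattern)]


# Hypothetical, simplified representation of a change of possession.
example = {
    "cat": "event", "prim": "GO", "fields": ("+poss",),
    "args": (
        {"cat": "thing", "prim": "BOOK"},
        {"cat": "path", "prim": "TO", "args": ({"cat": "thing", "prim": "MARY"},)},
    ),
}
print(classify(example))  # ['transfer of possession']
```

Because patterns behave like types, making a pattern more specific (for instance by filling in the path slot) yields a subclass of the class defined by the more general pattern.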

Here are a few examples, constructed for French verbs. As can be noted, the classes formed from LCS patterns are substantially different in nature and form from those obtained from syntactic or thematic criteria:

Verbs of spatial motion:
LCS pattern: [event GO+loc ([thing ], [path ])]
{aller, venir, partir, sortir, entrer, arriver, amener, déplacer, se rendre, s'amener, marcher, commander, livrer, approcher, avancer, mettre, apparaitre, survenir, quitter, bouger, poser, dissiper, extraire, monter, descendre, pénétrer ...}

Verbs of transfer of possession:
LCS pattern: [event GO+poss ([thing ], [path ])] (a CAUSE may be added)
{donner, prendre, voler, adresser, acquérir, alimenter, apprendre, cambrioler, allouer, offrir, prodiguer, retenir, consacrer, acheter, vendre, céder, fournir, livrer, échanger, troquer, abandonner, sacrifier, confier, procurer, apporter, remettre, porter, distribuer, rendre, octroyer, dilapider, perdre, ... }

Verbs of temporal 'motion':
LCS pattern: [event GO +temp ([thing ], [path ])]
{retarder, déplacer, avancer, ajourner, remettre, reporter, attarder, différer, repousser, anticiper, ... }

Verbs of change of state:
LCS pattern: [event GO +char,+ident([thing ], [path ])] (a CAUSE may also be added to that pattern).
{devenir, changer, évoluer, régresser, se modifier, se transformer, progresser, varier, diversifier, casser, altérer, aliéner, détériorer, détruire, construire ... }

Verbs of persistence of a state:
LCS pattern: [event STAY+char,+ident ([thing ], [place ])]
{maintenir, rester, persister, laisser, fixer, arrêter, immobiliser, entretenir, stabiliser, geler, figer, paralyser, pétrifier, ... }

Verbs of possession (state):
LCS pattern: [state BE+poss ([thing ], [place ])]
{avoir, posséder, détenir, bénéficier, jouir, disposer, ... }

Verbs of spatial extension:
LCS pattern: [state EXT+loc([thing ], [place ])]
{s'étendre, longer, côtoyer, raser, border, s'étaler, caboter, entourer, prolonger, ... }

Verbs of temporal extension:
LCS pattern: [state EXT+temp([thing ], [place ])]
{durer, immortaliser, éterniser, perpétuer, prolonger, ... }

Comparisons between approaches

It is quite difficult to compare the three approaches above, since they are based on very different assumptions. We can however indicate that classes constructed on syntactic criteria are of great interest from a theoretical point of view, in the study of the cooperation between syntax and semantics. They are certainly less useful in LKB design, since they are far from complete and include many kinds of exceptions.

The approach based on general semantic criteria is much more concrete and applicable. However, the classes which can be formed on this basis remain very general. Classes formed using the ontological criteria of WordNet are, from that point of view, more fine-grained, and they should be preferred (see section on WordNet 3.4). The LCS-based classification is also fine-grained; its main advantage is that it bases the classification on semantic frames, which can then be used for semantic representation. Frames may be felt to be more arbitrary but, in fact, they share a lot with the previous classification method. LCS representations are more formal and thus allow more precise classifications, but they may also be felt to be incomplete (because of their formalism based entirely on predicates and functions) and difficult to establish.

Relations with other areas of lexical semantics

It is clear that verbs are one of the most central syntactic categories in language. They have deep relations with the other categories: nouns, because verbs select arguments which are often nominals; adverbs, because adverbs modify verbs; and prepositions, since they introduce PPs. Verbs assign thematic roles to their arguments and to prepositions, which in turn assign thematic roles to NPs. Verbs associated with adverbs permit the computation of aspect.

Verb semantic classes in LKB

For the same reasons as above, verbs have a privileged position in LKBs. The following resources, described in the next chapter of this report, have developed quite extensive descriptions of verbs: 3.6, 3.9, 3.10, 3.10.2.

Verb semantic classes in Applications

Verbs are central in many applications, in particular in Machine Translation 4.1, 4.1.3. They are now attracting much interest in automatic indexing 4.3, 4.2.3 and information retrieval 4.2, 4.2.3, where indexes, previously based on keywords, are now formed of verbs and arguments in order to describe not only concepts but also actions or states. Projects of much interest from that point of view are reported in chapter 4.

EAGLES Central Secretariat