
Nouns, Nominalizations and Noun Phrases

In this section we address the basic requirements for encoding the lexical properties of nominals and nominalizations. Although nouns may not seem to encode as much information as verbs, their contribution to the overall interpretation of a sentence is very significant for the following reasons:

1. Nominals contribute towards determining implicit information in the form of ellipsed predicates:

(2)

a.: TYPE COERCION
b.: John enjoyed the book.
``John enjoyed some event that can be reconstructed from the lexical item `book'.''
Minimally: read, write.
c.: John began the movie.
``John began some event that can be reconstructed from the lexical item `movie'.''
Minimally: show, produce.

2. Similarly to (1), nominals provide event-based information:

(3)

a.: ADJECTIVAL MODIFICATION
b.: an occasional cup of coffee/cigarette.
``a cup of coffee that someone drinks occasionally.''
c.: a quick beer.
``a beer to be drunk quickly.''
d.: a long record.
``a record that plays for a long time.''

3. Nominals are responsible for the polysemous behavior of verbal predicates:

(4)

a.: ASPECTUAL SHIFTS
b.: John painted a beautiful portrait. (create)
c.: John painted the outside wall. (change of state)

(5)

a.: SENSE SHIFTS
b.: John uses aspirin. (i.e. to feel better)
c.: John uses the subway. (i.e. to travel)
d.: John uses e-mail. (i.e. to write)

The study of nominal forms in linguistics has usually been carried out within distinct approaches and traditions focusing on different properties of nouns and noun phrases. These reflect two main issues surrounding the analysis of nominals:

1.: the study of their structural properties;
2.: the study of their componential properties.

The structural properties of nominals

The study of the structural properties of nominal forms has usually been of great concern in syntax, where the aim was to develop abstract representations that would apply cross-categorially (cf. Chomsky, 1970; Jackendoff, 1977). One of the key observations here is that nominals have relational properties and, like verbs, carry argument structure. The argument structure of nominal forms comes in a variety of ways. Relational nouns obligatorily require their arguments to be expressed: father, neighbour, wife, side, part, etc. Other nominals are not underlyingly relational but nevertheless license arguments:

(6)

a.: the baker of the cake;
b.: the owner of the red car;

(7)

a.: the book of poetry;
b.: the door to the kitchen;

(8)

a.: the war between England and France;
b.: the meeting between the diplomats;

The interpretation of the participants mirrors very closely the interpretations of semantic roles for verbal predicates as shown especially with nominalized forms:

(9)

a.: the attack of the troops (AGENT)
b.: the acquisition of the stocks (THEME)
c.: the destruction of the city (PATIENT)

One important distinction between the arguments of nouns and verbs is that they behave differently with respect to expressibility conditions: with nouns arguments are optional, whereas with verbs they are obligatory (cf. Grimshaw, 1990; Van Valin and LaPolla, 1997):

(10)

a.: The enemy destroyed the city.
b.: *The enemy destroyed.

(11)

a.: the destruction of the city by the enemy;
b.: the destruction of the city;
c.: the destruction;

The underlying lexical semantic representation of a given nominal is responsible for accounting for the expressibility in syntax, which is the topic of the next section.
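The contrast in (10) and (11) can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which every licensed role of a verb must be realized while any subset of a noun's roles may be; the class `Predicate`, the function `is_expressible` and the role inventory are hypothetical names introduced only for this illustration.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    lemma: str
    category: str            # "verb" or "noun"
    roles: frozenset[str]    # roles the predicate licenses (illustrative inventory)


def is_expressible(pred: Predicate, realized: set[str]) -> bool:
    """Check the simplified expressibility condition for a set of realized roles."""
    if not realized <= pred.roles:
        return False                      # a role the predicate does not license
    if pred.category == "verb":
        return realized == pred.roles     # verbs: every argument must be expressed
    return True                           # nouns: any subset, including none


destroy = Predicate("destroy", "verb", frozenset({"AGENT", "PATIENT"}))
destruction = Predicate("destruction", "noun", frozenset({"AGENT", "PATIENT"}))

print(is_expressible(destroy, {"AGENT", "PATIENT"}))   # pattern of (10a)
print(is_expressible(destroy, {"AGENT"}))              # pattern of (10b)
print(is_expressible(destruction, {"PATIENT"}))        # pattern of (11b)
print(is_expressible(destruction, set()))              # pattern of (11c)
```

A realistic lexicon would need finer distinctions (optional verbal arguments, obligatory arguments of relational nouns), which this sketch leaves out.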

Componential Aspect

The problem of how to characterize the semantic contribution of nominals has always been at the heart of studies in lexical semantics. Minimal distinctions made between nominals concern their semantic classification, which provides the basis for typing lexical items appropriately.

The most studied semantic distinction between nominal types is the count versus mass distinction (Pelletier and Schubert, 1989), which captures the different behavior of nouns with respect to quantifiers:

(12)

a.: Much water
b.: *Much house

(13)

a.: *Many water
b.: Many houses

(14)

a.: There was water all over.
b.: *There was house all over.
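As a rough illustration of the pattern in (12) and (13), the count/mass feature can be treated as a selectional restriction imposed by the quantifier. The two lookup tables and the function name `is_acceptable` below are assumptions made for this sketch, not part of any existing lexicon.

```python
# Which noun type each quantifier selects (illustrative, two quantifiers only).
QUANTIFIER_SELECTS = {"much": "mass", "many": "count"}

# Count/mass typing of noun lemmas (illustrative, two nouns only).
NOUN_TYPE = {"water": "mass", "house": "count"}


def is_acceptable(quantifier: str, noun_lemma: str) -> bool:
    """Return True if the quantifier's selected type matches the noun's type."""
    return QUANTIFIER_SELECTS[quantifier] == NOUN_TYPE[noun_lemma]


for quantifier in ("much", "many"):
    for noun in ("water", "house"):
        mark = "" if is_acceptable(quantifier, noun) else "*"
        print(f"{mark}{quantifier} {noun}")
```

The sketch works on lemmas and ignores number morphology, so "many house" stands in for "many houses".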

In addition to this type of information, decompositional approaches to the representation of nouns usually assign a set of features which drive the inferences associated with a given nominal. This approach is advocated in the work of Katz and Fodor (1963), in Generative Semantics, in AI research (Schank, Wilks, 1977) as well as in a number of recent approaches to lexical semantics. The componential analysis of a noun such as bachelor would involve (at least) the following features:

(15): $\left[ \begin{array}{l} \mathbf{bachelor} \\ +\,\mathrm{HUMAN} \\ +\,\mathrm{MALE} \\ \mathrm{NOT\mbox{-}MARRIED} \end{array} \right]$

More recent work on the lexicon has been aimed at systematizing the methodology for representing the internal semantic structure of nominal forms. The Generative Lexicon approach developed in Pustejovsky (1991, 1995) employs the theory of qualia structure, which expresses the set of semantic relations characterizing the meaning of words. The qualia structure for nominal forms is summarized below:

(16): [attribute-value matrix giving the qualia structure for nominal forms; matrix contents not recoverable]

A nominal such as knife obtains the following representation:

(17): [attribute-value matrix giving the qualia structure of knife; matrix contents not recoverable]
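A qualia-based entry can be approximated as a small record type. The sketch below assumes the four qualia roles standardly used in Generative Lexicon theory (formal, constitutive, telic, agentive); the class names and the specific values given for knife are illustrative assumptions and should not be read as a reproduction of the representation in (17).

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Qualia:
    formal: str                                             # what kind of thing it is
    constitutive: list[str] = field(default_factory=list)   # parts or material
    telic: list[str] = field(default_factory=list)          # purpose or function
    agentive: list[str] = field(default_factory=list)       # how it comes into being


@dataclass
class NounEntry:
    lemma: str
    qualia: Qualia


# Illustrative values only: assumed for this sketch.
knife = NounEntry(
    lemma="knife",
    qualia=Qualia(
        formal="artifact_tool",
        constitutive=["blade", "handle"],
        telic=["cut"],
        agentive=["make"],
    ),
)

print(knife.qualia.telic)
```

Keeping each role as a separate field makes it possible for compositional rules to address one facet of the noun's meaning, for example only its purpose, without touching the others.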

Polysemous nominals

Ambiguity and polysemy of nominal forms represent an important concern which affects the organization of word meaning. The basic distinction between what Weinreich termed contrastive and complementary ambiguity calls for different solutions for the representation of lexical knowledge. Contrastive ambiguity, as manifested by words such as bank (financial institution versus shore), is handled by multiple representations:

(18): $\left[ \begin{array}{l} \mathbf{bank_1} \\ +\,\mathrm{COUNT} \\ +\,\mathrm{FINANCIAL\ INSTITUTION} \end{array} \right]$

(19): $\left[ \begin{array}{l} \mathbf{bank_2} \\ +\,\mathrm{COUNT} \\ +\,\mathrm{SHORE} \end{array} \right]$

The nominal bank₁, however, displays a polysemy which is common to company-like entities. This nominal may also have a location (building) interpretation. In this case the polysemy emerges as the result of the structure of the concept, where the location is part of the constitution of an organization. For this purpose a qualia-based representation provides the appropriate organizing principle for expressing this type of knowledge:

(20): [attribute-value matrix giving a qualia-based representation of bank₁, including the relation has(x,location); remaining matrix contents not recoverable]

Relation to other areas of lexical semantics

The semantics of nominals bears directly on a number of areas in semantics and lexical semantics:

the composition with adjectives;
the interpretation of nominal compounds;
their role in lexicalization processes;
their relation with aspect;
their interaction with quantification.

Nouns and Ontologies

The theoretical issues discussed here are to a large extent complementary to the work carried out by the ANSI Ad Hoc Ontology Standards Group (ANSI X3T2), which operates in the US. The ANSI committee has been meeting since March 1996 and has brought together researchers and developers of ontologies from a variety of disciplines and institutes. The final aim is the development of a standardized Reference Ontology of 10,000 general and fundamental concepts, based on the major concepts of a variety of existing resources, such as CYC (3.7.2), Pangloss-Sensus (3.7.5), Penman (3.7.4), MikroKosmos (3.7.3), EDR (3.6) and WordNet 1.5 (3.4.2). A description of the work done is given in [HovFC]. The main focus of their effort is not the lexicon but ontologies as a separate object. Nevertheless, many notions and issues discussed here are relevant to both lexical and non-lexical semantic resources.

Compounds

Nominal compounds represent a large part of our vocabulary and involve a great deal of creativity. For this reason they have received a great deal of attention in linguistic research ([Berg91], [Jes42], [Mar69], [Lee70], [Dow77], [Levi78], [War87]) and in computational linguistics, where their analysis has presented a serious challenge for natural language processing systems ([Fin80], [McD82], [Isa84], [Als87], [Hob93], [Boui92], [Job95], [Bus96b], [Cop97]). The reason for such interest is that the interpretation of Noun-Noun compounds poses serious problems. The strategy for forming English Noun-Noun compounds involves combining a modifier noun and a head noun, without making explicit the semantic relation between the compound members. The main difficulty lies in recovering the implicit semantic relation holding between the nouns, which varies greatly:

(21)

a.: coffee cup
b.: coffee producer
c.: coffee grinder
d.: coffee creamer
e.: coffee machine

When looking at cross-linguistic data, complex nominals require systematic solutions. In Italian, for example, complex nominals are generally composed of a head followed by a preposition and a modifying noun. Consider the correspondences below in (22).

(22)

a.: bread knife / coltello da pane
b.: wine glass / bicchiere da vino
c.: lemon juice / succo di limone
d.: glass door / porta a vetri

The analyses in [Levi78] and [Job95] rely on the enumeration of possible semantic relations observed in the language. A number of proposals exploiting generative models base the composition on the qualia structure, which provides the `glue' that links together the semantic contributions of the modifying noun and the head noun in the compound ([Joh95], [Bus96b], [Joh98], [Fab96]).
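The qualia-based idea can be sketched for the simplest case, in which the relation between modifier and head is read off the purpose (telic role) of the head noun. The table `HEAD_TELIC`, its values and the function `interpret_compound` are assumptions made for this illustration; they are not taken from the cited proposals, which cover more roles and more complex cases.

```python
from __future__ import annotations

# Telic (purpose) predicate assumed for each head noun; illustrative values only.
HEAD_TELIC = {
    "cup": "hold",
    "grinder": "grind",
    "knife": "cut",
    "glass": "hold",
}


def interpret_compound(modifier: str, head: str) -> tuple[str, str] | None:
    """Return (predicate, argument) linking modifier and head, or None if unknown."""
    predicate = HEAD_TELIC.get(head)
    if predicate is None:
        return None              # no telic information: the relation stays implicit
    return (predicate, modifier)


print(interpret_compound("coffee", "grinder"))
print(interpret_compound("bread", "knife"))
print(interpret_compound("coffee", "machine"))
```

Compounds such as coffee machine in (21e) return `None` here because the sketch has no entry for the head; an enumerative approach would instead pick a relation from a fixed list.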

Lexicalization

Lexicalization is concerned with the linguistic components that are responsible for the constructions allowed by a word itself. Consider an example of communicating that there was a shooting event. This can be done in many ways as shown below:

(23)

a.: The man shot the president twice.
b.: The killer shot the president twice.
c.: The man shot the president dead.
d.: ??The killer shot the president dead.

The sentences above communicate different outcomes of the event of shooting the president. Whereas the sentence in (23b) strongly conveys the information that the president died, the one in (23a) does not. What is clearly responsible for the different entailments is the lexical content differentiating man from killer.

Events

The role of events in the representation of nominals constitutes a rather controversial topic. In general, the only nouns that are taken to contain events in their representation are those that express an event directly, such as war or destruction. Within generative models, and more specifically Generative Lexicon theory, all lexical items carry event-based information, which is crucially involved in compositional processes involving, among others, type coercion, adjectival modification, event nominalizations and agentive nominals.
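The role of noun-internal event information in type coercion can be sketched as follows, using the readings given for John enjoyed the book (read, write) and John began the movie (show, produce). Assigning these events to telic and agentive roles, and the names `NOUN_EVENTS`, `EVENT_SELECTING_VERBS` and `reconstruct_events`, are assumptions made for this illustration.

```python
from __future__ import annotations

# Events associated with each noun, keyed by an assumed qualia role.
NOUN_EVENTS = {
    "book": {"telic": "read", "agentive": "write"},
    "movie": {"telic": "show", "agentive": "produce"},
}

# Verbs assumed to select an event rather than a plain object.
EVENT_SELECTING_VERBS = {"enjoy", "begin"}


def reconstruct_events(verb: str, noun: str) -> list[str]:
    """Return the events the noun can supply when the verb requires an event."""
    if verb not in EVENT_SELECTING_VERBS:
        return []                                  # no coercion needed
    events = NOUN_EVENTS.get(noun, {})
    return [events[role] for role in ("telic", "agentive") if role in events]


print(reconstruct_events("enjoy", "book"))
print(reconstruct_events("begin", "movie"))
print(reconstruct_events("paint", "book"))
```

If nouns carried no event information, the lookup in `NOUN_EVENTS` would have nothing to return, and the elided predicate would have to be listed separately for every verb-noun pair.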

Quantification

Quantification within the semantics of nominals is a very broad and difficult topic. Within formal models, the type of entity referred to determines a range of quantificational phenomena:

1. quantification

(24)

a.: a policeman/two/some policemen (singleton and multiform set of objects)
b.: some/two police (multiform set of objects)
c.: a squadron/two/some squadrons (singleton and multiform set of groups)
d.: some/much traffic (non-discrete collection of objects)
e.: some/much water (non-discrete amount of substance)
f.: three medicines (the number of medicine-types is counted)

Ordinary count nouns can take discrete quantifiers to form multiform sets, (cf., (24a)). In the case of group nouns, (24c), groups are counted and not the members of the group. In the case of collectives, e.g. (24d), the number of elements is measured and not counted. Genuine substances as in (24e) represent measurements both as referential entities and as conceptual units. Finally, we see a case of quantification of entity types in (24f). The diversity is quantified, nothing is said about the amount of medicine.

2. predication distribution

(25)

a.: The police/squadron/policemen carried new equipment (distributive)
b.: The policemen carried the piano upstairs (collective)
c.: The family loves each other (reciprocal)
d.: The committee meets (multiplicity of subject)
e.: two big healthy families/boys (different property distributions)

In (25a) we see that the multiplicity (set, group or collective) of the subject and the object explains why distributive, collective and cumulative interpretations are all available: each person carries a single set of equipment, they all carry one set of equipment, or any combination of the two. In (25b) the fact that a piano is a single, big, individuated object suggests that the subjects carry it collectively. In the next two examples we see that the multiplicity of the subject is a necessary selection criterion for the reciprocal interpretation of the predicate. The final example shows that group nouns exhibit two different entity levels for quantification and property distribution.

3. agreement

(26)

a.: The band play/plays well
b.: The committee meet/meets

The conditions determining agreement vary considerably across languages. In the English examples, we see that quantificational semantic properties can play a role: singular subjects can govern a plural verb when they are interpreted as multiform group nouns. See also [Cop95] for a discussion of group nouns.

How is information encoded in lexical databases

There is a fairly large number of treatments which will be surveyed by the next meeting. These include 3.7, Ontos (Nirenburg), Penman (3.7.4).

Acquilex

Nouns in Acquilex are represented using Typed Feature Structures in the HPSG tradition, encoding the following types of information:

CAT: morpho-syntactic properties such as number, gender, person, countability
SEM: formal semantic properties such as set, mass
QUALIA: conceptual semantic properties

The QUALIA specification roughly corresponds to the Generative Lexicon approach. The SEM specification incorporates formal semantic notions of entity types. These are correlated with the QUALIA properties. These correlations predict collective and distributive property distributions for the different multiform nouns (plural, group, collective) as discussed above. This is illustrated by the following feature structures:

(27): [typed feature structure; matrix contents not recoverable]

(28): [typed feature structure; matrix contents not recoverable]

(29): [typed feature structure; matrix contents not recoverable]

(30): [typed feature structure; matrix contents not recoverable]

(31): [typed feature structure; matrix contents not recoverable]
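The way CAT, SEM and QUALIA information combines can be approximated with plain feature-structure unification over nested dictionaries. This is a simplified sketch: it is untyped (there is no type hierarchy as in the Acquilex formalism), and the feature names below the three top-level attributes, as well as all values, are hypothetical.

```python
from __future__ import annotations
from typing import Any


def unify(a: Any, b: Any) -> Any:
    """Unify two feature structures (nested dicts with atomic leaves).

    Returns the merged structure, or None if two atomic values clash.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None          # clash somewhere below this feature
                result[feature] = merged
            else:
                result[feature] = value
        return result
    return a if a == b else None         # atoms unify only if identical


# Hypothetical template for group nouns and a partial lexical entry.
group_template = {"CAT": {"COUNT": "+"}, "SEM": {"ENTITY": "set"}}
committee = {"CAT": {"NUMBER": "sg"}, "QUALIA": {"FORMAL": "group"}}

print(unify(group_template, committee))
print(unify(group_template, {"SEM": {"ENTITY": "mass"}}))
```

Unification succeeds when the two structures carry compatible information and fails on a clash, which is how a mass specification would be rejected for an entry already typed as a set.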

How is the information used in LE applications

In principle, basic semantic networks and taxonomies such as WordNet can be used directly in various applications. There is, however, a direct correlation between the amount of detail in the semantic information that is encoded and the level of interpretation and information processing that can be achieved.

Natural Language Generation

Natural language generation (4.5) faces the serious problem of connecting a discourse plan with the linguistic module that is responsible for the surface realization. Noun semantics is most relevant for lexical choice, namely selecting the lexical item that packages the information best suited to communicating a given meaning.

Lexical Representations for Information Extraction

A system may benefit from a way of distinguishing the different profiles of the individuals in a text, e.g. rebels versus army (cf. [Hob91]). In discussing some of the higher-rated mistakes in information extraction based on abductive inference -- another knowledge-intensive method of NLP -- [Hob91] points to cases of missing axioms (or lexical knowledge) such as knowing that a bishop is a profession.


EAGLES Central Secretariat eagles@ilc.cnr.it