Nouns, Nominalizations and Noun Phrases
In this section we address the basic requirements for encoding the lexical properties of nominals and nominalizations. Although nouns may not seem to encode as much information as verbs, their contribution to the overall interpretation of a sentence is very significant, for the following reasons:
- Nominals contribute towards determining implicit information in the form of ellipsed predicates:
  - TYPE COERCION
    - John enjoyed the book.
      ``John enjoyed some event that can be reconstructed from the lexical item `book'.''
      Minimally: read, write.
    - John began the movie.
      ``John began some event that can be reconstructed from the lexical item `movie'.''
      Minimally: show, produce.
- Similarly to the first point, nominals provide event-based information:
  - ADJECTIVAL MODIFICATION
    - an occasional cup of coffee.
      ``a cup of coffee that someone drinks occasionally.''
    - a quick beer.
      ``a beer to be drunk quickly.''
    - a long record.
      ``a record that plays for a long time.''
- Nominals are responsible for the polysemous behavior of verbal predicates:
  - ASPECTUAL SHIFTS
    - John painted a beautiful portrait. (create)
    - John painted the outside wall. (change of state)
  - SENSE SHIFTS
    - John uses aspirin. (i.e. to feel better)
    - John uses the subway. (i.e. to travel)
    - John uses e-mail. (i.e. to write)
The study of nominal forms in linguistics has usually been carried out within distinct approaches and traditions focusing on different properties of nouns and noun phrases. These reflect two main issues surrounding the analysis of nominals:
- the study of their structural properties;
- the study of their componential properties.
The structural properties of nominals
The study of the structural properties of nominal forms has usually been of great concern in syntax, where the aim was to develop abstract representations that would apply cross-categorially (cf. Chomsky, 1970).
One of the keys to this aspect is the observation that nominals, similarly to verbs, carry argument structure. The argument structure of nominal forms comes in a variety of ways. Relational nouns obligatorily require their arguments to be expressed: father, neighbour, wife, side, part, etc.
There are other nominals that are not underlyingly relational but which can nevertheless take arguments:
- the baker of the cake;
- the owner of the red car;
- the book of poetry;
- the door to the kitchen;
- the war between England and France;
- the meeting between the diplomats;
The interpretation of the participants mirrors very closely the semantic roles for verbal predicates, as shown especially with nominalizations:
- the attack of the troops (AGENT)
- the acquisition of the stocks
- the destruction of the city
One important distinction between the arguments of nouns and verbs is that they behave differently with respect to expressibility conditions: with nouns arguments are optional, whereas with verbs they are obligatory (cf. Grimshaw, 1990; Van Valin and ...):
- The enemy destroyed the city.
- *The enemy destroyed.
- the destruction of the city by the enemy;
- the destruction of the city;
- the destruction;
The underlying lexical semantic representation of a given nominal is responsible for accounting for this expressibility in syntax, which is the topic of the next section.
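The contrast between `*The enemy destroyed' and `the destruction' can be made concrete with a small check. The following is only a minimal sketch: the `Predicate` class, the `is_expressible` function and the role names are illustrative assumptions, and the rule that every verbal argument is obligatory is a deliberate simplification of the generalization stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    lemma: str
    category: str      # "verb" or "noun"
    arguments: tuple   # licensed argument slots, e.g. ("agent", "patient")


def is_expressible(pred: Predicate, realized: set) -> bool:
    """Check whether a set of realized arguments is acceptable for pred."""
    licensed = set(pred.arguments)
    if not realized <= licensed:
        return False                 # an argument the predicate does not license
    if pred.category == "verb":
        return realized == licensed  # assumption: verbal arguments are obligatory
    return True                      # nominal arguments may be left unexpressed


# Hypothetical entries mirroring the examples in the text.
destroy = Predicate("destroy", "verb", ("agent", "patient"))
destruction = Predicate("destruction", "noun", ("agent", "patient"))
```

Under these assumptions, `is_expressible(destroy, {"agent"})` is rejected, while `destruction` is accepted with both arguments, with the patient only, or with none.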
The problem of how to characterize the semantic contribution of nominals has always been at the heart of studies in lexical semantics. Minimal distinctions made between nominals concern their semantic type, which provides the basis for appropriately typing the lexical items.
The most studied semantic distinction between nominal types is the count versus mass distinction (Pelletier and Schubert, 1989), which captures the different behavior of nouns with respect to quantifiers:
- Much water
- *Much house
- *Many water
- Many houses
- There was water all over.
- *There was house all over.
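The quantifier judgements above follow from a single type feature on the noun. As a minimal sketch, assuming a two-entry lexicon and a function name (`accepts`) that are purely illustrative, the much/many pattern could be checked like this:

```python
# Hypothetical mini-lexicon: each noun is typed as "count" or "mass".
NOUN_TYPE = {"house": "count", "water": "mass"}


def accepts(quantifier: str, noun: str, plural: bool = False) -> bool:
    """Return True if the quantifier-noun combination is well formed."""
    kind = NOUN_TYPE[noun]
    if quantifier == "much":
        return kind == "mass" and not plural   # much water / *much house
    if quantifier == "many":
        return kind == "count" and plural      # many houses / *many water
    raise ValueError(f"unsupported quantifier: {quantifier}")
```

The sketch covers only the four quantifier examples; the `There was ... all over' contrast would need a further rule for bare singular nouns.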
In addition to this type of information, decompositional approaches to the representation of nouns usually assign a set of features which drive the inferences associated with a given nominal. This approach is advocated in the work of Katz and Fodor (1963), in Generative Semantics, in AI research (Schank, Wilks, 1977), as well as in a number of recent approaches to lexical semantics. The componential analysis of a noun such as bachelor would involve (at least) the following features:
More recent work on the lexicon has been aimed at systematizing the representation of the internal semantic structure of nominal forms. The Generative Lexicon approach developed in Pustejovsky (1991, 1995) employs the theory of qualia structure, which expresses the set of semantic relations characterizing the meaning of words. The qualia structure for nominal forms is summarized below:
A nominal such as knife obtains the following representation:
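As a rough illustration of what a qualia-based entry might look like, here is a minimal sketch that assumes the four qualia roles of Pustejovsky (1991, 1995): formal, constitutive, telic and agentive. The class name, the field layout and the particular values given for knife are illustrative assumptions; they are not the notation or the content of the original representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Qualia:
    formal: str                 # what kind of thing the noun denotes
    constitutive: tuple = ()    # its parts or material
    telic: tuple = ()           # its purpose or function
    agentive: tuple = ()        # how it comes into being


# Illustrative (assumed) values for the nominal "knife".
knife = Qualia(
    formal="artifact",
    constitutive=("blade", "handle"),
    telic=("cut",),
    agentive=("make",),
)
```

Keeping each role as a separate field lets compositional operations such as type coercion or adjectival modification pick out exactly the relation they need, for instance `knife.telic` for the purpose reading.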
Ambiguity and polysemy of nominal forms represent an important concern which affects the organization of word meaning. The basic distinction between what Weinreich termed contrastive and complementary ambiguity should involve different solutions for the representation of lexical items. Contrastive ambiguity, as manifested by words such as bank (financial institution versus shore), is handled by multiple representations:
The nominal bank1, however, displays a polysemy which is common to entities of this kind. This nominal may also have a location (building) interpretation. In this case the polysemy emerges as the result of the structure of the concept, where the location is part of the constitution of an organization. For this purpose a qualia-based representation is the appropriate organizing principle for expressing this type of knowledge:
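The two treatments can be contrasted in a small sketch: contrastive ambiguity yields separate entries, whereas the institution/building polysemy of bank1 is read off a single entry. Everything here is an assumption made for illustration: the dictionary layout, the entry name `bank2`, the constitutive values and both helper functions.

```python
# Contrastive ambiguity: two unrelated entries share the form "bank".
# Complementary polysemy: the building reading of bank1 is derived from
# the constitutive role of one entry rather than stored as a new entry.
LEXICON = {
    "bank1": {"formal": "financial_institution",
              "constitutive": ("building", "staff")},
    "bank2": {"formal": "shore", "constitutive": ()},
}


def senses(form: str) -> list:
    """List the entries whose name is `form` followed by a sense index."""
    return sorted(e for e in LEXICON if e.rstrip("0123456789") == form)


def has_location_reading(entry: str) -> bool:
    """An entry licenses a location reading if a building is part of it."""
    return "building" in LEXICON[entry]["constitutive"]
```

With these entries, `senses("bank")` returns both entries, while only `bank1` passes `has_location_reading`.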
Relation to other areas of lexical semantics
The semantics of nominals bears directly on a number of areas in semantics and lexical semantics:
- the interpretation of nominal compounds;
- their role in lexicalization processes;
- their relation with aspect;
The theoretical issues discussed here are to a large extent complementary to the work carried out by the ANSI Ad Hoc Ontology Standards Group (ANSI X3T2), which operates in the US. The ANSI Committee has been meeting since March 1996 and has brought together researchers and developers of ontologies from a variety of disciplines and institutes. The final aim is the development of a standardized Reference Ontology of 10,000 general and fundamental concepts, based on the major concepts of a variety of existing resources, such as CYC (3.7.2), Pangloss-Sensus (3.7.5), Penman (3.7.4), MikroKosmos (3.7.3), EDR (3.6) and WordNet 1.5 (3.4.2). A description of the work done is given in [HovFC]. The main focus of their effort is not the lexicon but ontologies as a separate object. Nevertheless, many notions and issues discussed here are relevant to both lexical and non-lexical semantic resources.
Nominal compounds represent a large part of our vocabulary and involve a great deal of creativity. For this reason they have received a great deal of attention in linguistic research ([Berg91], [Jes42], [Mar69], [Lee70], [Levi78], [War87]) and in computational linguistics, where their analysis has presented a serious challenge for natural language systems ([Fin80], [McD82], [Isa84], [Als87], [Hob93], [Boui92], [Job95], [Bus96b], [Cop97]). The reason for such interest is that the interpretation of Noun-Noun compounds poses serious problems. The strategy for forming English Noun-Noun compounds involves combining a modifier noun and a head noun, without making explicit the semantic relation between the compound members. The main difficulty lies in recovering the implicit semantic relation holding between the nouns, which varies greatly:
- coffee cup
- coffee producer
- coffee grinder
- coffee creamer
- coffee machine
When looking at cross-linguistic data, complex nominals require systematic solutions. In Italian, for example, complex nominals are generally composed of a head followed by a preposition and a modifying noun. Consider the correspondences in (22):
- bread knife: coltello da pane
- wine glass: bicchiere da vino
- lemon juice: succo di limone
- glass door: porta a vetri
The analysis in [Levi78] and [Job95] relies on the enumeration of possible semantic relations observed in the language. A number of proposals exploiting generative models base the composition on the qualia structure, which provides the `glue' that links together the semantic contributions of the modifying noun and the head noun in the compound ([Joh95], [Bus96b], [Joh98], ...).
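The structural difference between the English and Italian patterns in (22) can be sketched as two small parsers that map both surface orders onto the same head/modifier record. The `Compound` type and the two function names are assumptions made for illustration, and the parsers only handle the simple two-word and three-word shapes shown in (22).

```python
from typing import NamedTuple, Optional


class Compound(NamedTuple):
    modifier: str
    head: str
    preposition: Optional[str] = None   # only overt in the Italian pattern


def parse_english(phrase: str) -> Compound:
    """English Noun-Noun compound: modifier first, head last."""
    modifier, head = phrase.split()
    return Compound(modifier, head)


def parse_italian(phrase: str) -> Compound:
    """Italian complex nominal: head, then preposition, then modifier."""
    head, preposition, modifier = phrase.split()
    return Compound(modifier, head, preposition)
```

For example, `parse_english("bread knife")` and `parse_italian("coltello da pane")` both put the knife noun in the `head` slot; the Italian record additionally carries the preposition, which is the overt material that the English compound leaves implicit.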
Lexicalization is concerned with the linguistic components that are responsible for the constructions allowed by a word itself. Consider an example of communicating that there was a shooting event. This can be done in many ways, as shown in (23):
- (23a) The man shot the president twice.
- (23b) The killer shot the president twice.
- (23c) The man shot the president dead.
- (23d) ??The killer shot the president dead.
The sentences above communicate different outcomes of the event of shooting the president. Whereas the sentence in (23b) strongly conveys the information that the president died, the one in (23a) does not. What is clearly responsible for the different entailments is the lexical content differentiating man and killer.
The role of events in the representation of nominals constitutes a rather controversial topic. In general, the only nouns that are taken to contain events in their representation are those that express an event directly, for instance nouns such as war, destruction, and so on. Within generative models, and more specifically Generative Lexicon theory, all lexical items carry event-based information, which is crucially involved in compositional processes such as type coercion, adjectival modification, event nominalizations and, among others, agentive nominals.
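Type coercion, as in `John enjoyed the book', can be sketched as a lookup of the events stored with the noun. This is a minimal sketch under stated assumptions: the event values (read, write, show, produce) come from the examples in this section, but their assignment to the telic and agentive roles, the set of event-selecting verbs and the function name are illustrative choices rather than part of the original analysis.

```python
# Events associated with each noun, keyed by (assumed) qualia role.
QUALIA_EVENTS = {
    "book": {"telic": "read", "agentive": "write"},
    "movie": {"telic": "show", "agentive": "produce"},
}

# Verbs assumed to select an event and hence to trigger coercion.
EVENT_SELECTING_VERBS = {"enjoy", "begin"}


def reconstruct_events(verb: str, noun: str) -> list:
    """Return the events that can fill in for the noun after the verb."""
    if verb not in EVENT_SELECTING_VERBS:
        return []                      # no coercion is triggered
    roles = QUALIA_EVENTS.get(noun, {})
    return [roles[r] for r in ("telic", "agentive") if r in roles]
```

A noun with no stored events yields an empty list, which corresponds to the position that such a noun could not be coerced.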
Quantification within the semantics of nominals is a very broad and difficult topic. Within Formal Models the type of entity referred to determines a range of quantificational phenomena:
- (24a) a policeman / two/some policemen (singleton and multiform set of objects)
- (24b) some/two police (multiform set of objects)
- (24c) a squadron / two/some squadrons (singleton and multiform set of groups)
- (24d) some/much traffic (non-discrete collection of objects)
- (24e) some/much water (non-discrete amount of substance)
- (24f) three medicines (the number of medicine-types is counted)
Ordinary count nouns can take discrete quantifiers to form multiform sets (cf. (24a)). In the case of group nouns (24c), groups are counted and not the members of the group. In the case of collectives, e.g. (24d), the number of elements is measured and not counted. Genuine substances as in (24e) represent measurements both as referential entities and as conceptual units. Finally, we see a case of quantification of entity types in (24f): the diversity is quantified, but nothing is said about the amount of medicine.
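The distinctions in (24) amount to recording, for each entity type, what is quantified and whether it is counted or measured. The following is a minimal sketch of that table; the enumeration, the dictionary names and the `takes_numeral` helper are assumptions made for illustration, and the type-level reading in (24f) is left out.

```python
from enum import Enum


class EntityType(Enum):
    COUNT = "count"
    GROUP = "group"
    COLLECTIVE = "collective"
    SUBSTANCE = "substance"


# For each entity type: (what is quantified, how it is quantified).
QUANTIFICATION = {
    EntityType.COUNT: ("objects", "counted"),
    EntityType.GROUP: ("groups", "counted"),
    EntityType.COLLECTIVE: ("elements", "measured"),
    EntityType.SUBSTANCE: ("substance", "measured"),
}

# Hypothetical lexicon covering the nouns used in (24).
NOUNS = {
    "policeman": EntityType.COUNT,
    "squadron": EntityType.GROUP,
    "traffic": EntityType.COLLECTIVE,
    "water": EntityType.SUBSTANCE,
}


def takes_numeral(noun: str) -> bool:
    """Discrete quantifiers such as 'two' need something that is counted."""
    return QUANTIFICATION[NOUNS[noun]][1] == "counted"
```

For a group noun such as squadron, the table records that it is the groups, not their members, that are counted.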
In (25a) we see that the multiplicity (set, group or collective) of the subject and the object explains why we can have distributive, collective and cumulative interpretations: each person carries a single set of equipment, they all carry one set of equipment, or any combination is possible. In (25b) the fact that a piano is a single and big individuated object suggests that the subject carries it collectively. In the next two examples we see that the multiplicity of the subject is a necessary selection criterion for the reciprocal interpretation of the predicate. The final example shows that group nouns exhibit two different entity levels for quantification and property distribution.
- (25a) The police/squadron/policemen carried new equipment (distributive)
- (25b) The policemen carried the piano upstairs (collective)
- (25c) The family loves each other (reciprocal)
- (25d) The committee meets (multiplicity of subject)
- (25e) two big healthy families/boys (different property distributions)
The conditions determining agreement vary considerably across languages. In the English examples below, we see that quantificational semantic properties can play a role: a singular subject can govern a plural verb when it is interpreted as a multiform group noun. See also [Cop95] for a discussion of group nouns.
- The band played/plays well
- The committee met/meets
How is information encoded in lexical databases
There is a fairly large number of treatments which will be surveyed by
the next meeting. These include 3.7, Ontos (Nirenburg),
Nouns in Acquilex are represented using Typed Feature Structures in the HPSG tradition, encoding the following types of information:
- morpho-syntactic properties such as number, gender, person,
- formal semantic properties such as set; mass
- conceptual semantic properties
The QUALIA specification roughly corresponds to the Generative Lexicon approach. The SEM specification incorporates formal semantic notions on entity types. These are correlated with the QUALIA properties. These correlations predict collective and distributive property distributions for the different multiform nouns (plural, group, collective), as discussed above. This is illustrated by the following feature structures:
In principle, basic semantic networks and taxonomies such as WordNet can be used directly in various applications. There is, however, a direct correlation between the amount of detail in the semantic information that is encoded and the level of interpretation and information processing that can be achieved.
How is the information used in LE applications
Natural Language Generation
Natural language generation (4.5) faces the serious problem of connecting a discourse plan with the linguistic module that is responsible for the surface realization. Noun semantics is most relevant for lexical choice, namely selecting the lexical item that packages the most appropriate information and is best suited for communicating a given meaning.
Lexical Representations for Information Extraction
A system may benefit from a way of distinguishing the different profiles of the individuals in a text, e.g. rebels versus army (cf. [Hob91]). In discussing some of the higher-rated mistakes in information extraction based on abductive inference, another knowledge-intensive method of NLP, [Hob91] points to cases of missing axioms (or lexical knowledge), such as knowing that a bishop is a profession.
EAGLES Central Secretariat firstname.lastname@example.org