Subcategorisation in Categorial Grammar

Preliminary Recommendations

The Categorial Grammar approach to subcategorisation will be exemplified with a description of the UCG framework used in the ACQUILEX project (Sanfilippo, 1993b). For ease of presentation, only a brief summary is given -- see Sanfilippo (1993b) for a full account.

Sign structure

Words and phrases are represented as (typed) feature structures where orthographic, syntactic and semantic information is simultaneously represented as a conjunction of attribute-value pairs forming a sign:

        [ORTH: orth

         CAT: cat

         SEM: sem]

The category attribute of a sign is either basic or complex. Basic categories are binary feature structures consisting of a category type and a series of attribute-value pairs encoding morphosyntactic information:

        [CAT-TYPE: cat-type

         M-FEATS: m-feats]

Three basic cat-types are used:

n (noun);
np (noun phrase); and
sent (sentence).

For graphical convenience, basic category structures are abbreviated as follows:

        cat-type[m-feats]

Morphosyntactic features are included only where needed.

Complex categories are recursively defined by letting the type `cat' instantiate a feature structure with attributes RESult, DIRection and ACTive. RESult can take as value either a basic or complex category, ACTive is of type `sign', and the direction attribute encodes order of combination relative to the active part of the sign (e.g. forward or backward):

        [RES: cat

         DIR: dir

         ACT: sign]

In verbs, the active part of the category structure encodes the subcategorisation properties, e.g. subject and object for transitives:

        [ORTH: < love >

         CAT:[RES:[RES:sent

                   ACT:[np-sign

                        CAT:np[nom]]]

              ACT:[np-sign

                   CAT:np[acc]]]]
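
The sign and category definitions above can be mirrored in a small data model. The following Python sketch is only an illustration: the class names `Basic`, `Complex` and `Sign` and the helper `active_signs` are hypothetical, the DIR value is left unspecified as in the diagram, and the subject is assumed to be a nominative np.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Basic:
    """Basic category: a category type plus morphosyntactic features."""
    cat_type: str          # "n", "np" or "sent"
    m_feats: tuple = ()    # e.g. ("acc",)

    def __str__(self) -> str:
        # Mirrors the cat-type[m-feats] abbreviation used in the text.
        feats = ",".join(self.m_feats)
        return f"{self.cat_type}[{feats}]" if feats else self.cat_type


@dataclass(frozen=True)
class Complex:
    """Complex category: RESult, ACTive sign and (optional) DIRection."""
    res: "Category"
    act: "Sign"
    direction: Optional[str] = None  # not specified in the diagram


Category = Union[Basic, Complex]


@dataclass(frozen=True)
class Sign:
    """A sign; ORTH and SEM are optional in this sketch."""
    cat: Category
    orth: Optional[str] = None
    sem: Optional[str] = None


def active_signs(cat: Category) -> list:
    """Collect active signs from the outermost ACT inwards."""
    found = []
    while isinstance(cat, Complex):
        found.append(cat.act)
        cat = cat.res
    return found


# Assumed reading of the diagram: nominative subject, accusative object.
subject = Sign(cat=Basic("np", ("nom",)))
obj = Sign(cat=Basic("np", ("acc",)))
love = Sign(orth="love",
            cat=Complex(res=Complex(res=Basic("sent"), act=subject), act=obj))

print([str(s.cat) for s in active_signs(love.cat)])  # ['np[acc]', 'np[nom]']
```

The object is listed first because it is the outermost active sign; the subject is the innermost one, next to the `sent' result.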

The semantics of a sign is a formula. A formula consists of an index, a predicate and at least one argument, which can be either an entity or a formula (both are subsumed by the type `sem'):

        [IND: entity

         PRED: pred

         ARG1: sem]

The index of a formula is an entity which provides partial information about the ontological type denoted by the formula, e.g. `e' for eventualities and `o,x,y,z' for individual objects. In addition, a contentless entity, `dummy', is employed in the semantic characterisation of pleonastic noun phrases, e.g. the subject of extraposition verbs. For ease of exposition formulas are linearised, e.g. the feature structure

        [IND: [1] x

         PRED: book        

         ARG1: [1]]

where [1] flags reentrant (i.e. identical) values, is abbreviated as <x1>book(x1), where x1 is a named variable.
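
The linearisation can be pictured as a small recursive procedure. The sketch below is an assumption-laden illustration, not the ACQUILEX implementation: formulas are plain dictionaries with hypothetical keys `ind`, `pred` and `args`, reentrancy is modelled by Python object identity, and variables are numbered per sort in order of first appearance.

```python
class Entity:
    """An entity with an ontological sort, e.g. 'e' or 'x' (hypothetical class)."""
    def __init__(self, sort):
        self.sort = sort


def linearise(formula, names=None):
    """Render a formula as <index>pred(args), naming variables x1, e1, ..."""
    names = {} if names is None else names

    def name(entity):
        # The same object always gets the same name: this models the tag [1].
        key = id(entity)
        if key not in names:
            same_sort = sum(1 for n in names.values() if n.startswith(entity.sort))
            names[key] = f"{entity.sort}{same_sort + 1}"
        return names[key]

    def render(value):
        # An argument is either an entity or an embedded formula.
        if isinstance(value, Entity):
            return name(value)
        return linearise(value, names)

    args = ", ".join(render(a) for a in formula["args"])
    return f"<{name(formula['ind'])}>{formula['pred']}({args})"


x = Entity("x")
book = {"ind": x, "pred": "book", "args": [x]}  # IND and ARG1 share one value
print(linearise(book))  # <x1>book(x1)
```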

Subcategorisation

The classification of subcategorisation types involves defining:

Semantic predicate-argument structures
Category structures
Verb signs where links are made across the semantic and category structures

Semantic predicate-argument structures

Verbs are characterised as properties of eventualities, and thematic roles are relations between eventualities and individuals, e.g.

        <e1>and(<e1>sleep(e1), <e1>agent(e1,john))

Following Dowty (1989), the semantic content of thematic relations is expressed in terms of prototypical cluster-concepts -- the proto-agent and proto-patient roles (`p-agt', `p-pat') -- determined for each choice of predicate through attribution of selected entailments which qualify the relative agentive strength and affectedness of event participants. Dowty's insights are augmented by introducing a third proto-role, `prep', for prepositional arguments (`semantically restricted' in LFG terms) and the contentless predicate `no-θ' to characterise the relation between a pleonastic NP and its governing verb. In addition, proto-roles are formalised as supersets of specific clusters of meaning components which are instrumental in the identification of semantic verb classes (Sanfilippo & Poznański, 1992; Sanfilippo, 1993b; Sanfilippo, 1993a) -- see examples.

A primary semantic classification of verb types is obtained in terms of argument arity. Further distinctions are made according to what kind of verbal arguments are encoded:

proto-agent, e.g. `John' in ``John sleeps'' and the sentences below
proto-patient, e.g. ``a book'' in ``John read a book''
prepositional
- oblique/indirect, e.g. ``to Mary'' in ``John gave a book to Mary''
- objective, e.g. `Mary' in ``John gave Mary a book''
non-thematic, e.g. `Bill' in ``Bill seems to be sad''
pleonastic, e.g. `It' in ``It bothers Bill that Mary left''
predicative (`xcomp'), e.g. ``to leave'' in ``John wishes to leave''
sentential (`comp'), e.g. ``that Mary left'' in ``John said that Mary left''

Here are some of the semantic structures distinguished:

For strict intransitive verbs, e.g. ``John sleeps'':

        STRICT-INTRANS-SEM

           <e1>and(<e1>pred(e1), <e1>p-agt(e1,x))

The semantics of strict transitive verbs, e.g. ``John drinks a beer'':

        STRICT-TRANS-SEM

           <e1>and(<e1>pred(e1), <e1>and(<e1>p-agt(e1,x),

                                         <e1>p-pat(e1,y)))

For transitives taking an oblique complement and ditransitives, e.g. ``Mary gave a book to Bill'' and ``Mary gave Bill a book'':

        OBL-TRANS/DITRANS-SEM

           <e1>and(<e1>pred(e1), <e1>and(<e1>p-agt(e1,x), 

                                         <e1>and(<e1>p-pat(e1,y),

                                                 <e1>prep(e1,z))))

Intransitive verbs with a thematic subject which take a clausal complement, e.g. ``John intended to come'' and ``Bill thought that John would come'':
```
        P-AGT-SUBJ-INTRANS-XCOMP/COMP-SEM

           <e1>and(<e1>pred(e1), <e1>and(<e1>p-agt(e1,x), verb-sem))
```
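
The semantic templates above share one shape: the verb predicate is conjoined with a right-nested conjunction of role formulas. As a rough illustration, the Python sketch below generates the linearised strings; the function names `role`, `conj` and `verb_sem` are hypothetical, and the third participant is named `z` here by assumption.

```python
def role(name, participant, event="e1"):
    """A thematic relation between an eventuality and a participant."""
    return f"<{event}>{name}({event},{participant})"


def conj(left, right, event="e1"):
    """Conjunction of two formulas indexed by the same eventuality."""
    return f"<{event}>and({left}, {right})"


def verb_sem(pred, roles, event="e1"):
    """Build a verb semantics from (role_name, participant) pairs, in order."""
    if not roles:
        raise ValueError("at least one argument is required")
    parts = [role(n, p, event) for n, p in roles]
    # Fold from the right so that roles nest as and(r1, and(r2, r3)).
    nested = parts[-1]
    for part in reversed(parts[:-1]):
        nested = conj(part, nested, event)
    return conj(f"<{event}>{pred}({event})", nested, event)


print(verb_sem("pred", [("p-agt", "x")]))
print(verb_sem("pred", [("p-agt", "x"), ("p-pat", "y")]))
print(verb_sem("pred", [("p-agt", "x"), ("p-pat", "y"), ("prep", "z")]))
```

The three calls correspond to the strict intransitive, strict transitive and oblique-transitive/ditransitive structures respectively.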

Category structures

Category structures are distinguished according to the values for the features RES and ACT. For example, the CAT of strict intransitives states that the result is a basic category of type `sent' and the active part is a noun phrase (i.e. there is only subject selection):

        STRICT-INTRANS-CAT

                [RES: sent

                 ACT: np-sign]

More complex category types can be built from more basic ones.

For strict transitives, e.g. ``Bill ate a sandwich'':

        STRICT-TRANS-CAT

                [RES: strict-intrans-cat

                 ACT: [np-sign

                       CAT: np[acc]]]

Morphosyntactic restrictions are encoded in selected (active) signs. For example, the outermost argument bears accusative case in the definition of the ditransitive category and prepositional case (`p-case') in the category definition for transitives which take a PP complement:

For ditransitives, e.g. ``John gives Mary a book'':

        DITRANS-CAT

                [RES: strict-trans-cat

                 ACT: [np-sign

                       CAT: np[acc]]]

For transitives which take a prepositional complement, e.g. ``John gave a book to Mary'':

        OBL-TRANS-CAT 

                [RES: strict-trans-cat

                 ACT: [np-sign

                       CAT: np[p-case]]]
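
Because each category type names a simpler type as its RES value, a full category can be recovered by expanding type names until only basic categories remain. The following Python sketch illustrates this under simplifying assumptions: the `TYPES` table and the functions `expand` and `arity` are hypothetical, and active signs are reduced to their category abbreviation.

```python
# Category types as defined in the text; RES may name another type.
TYPES = {
    "strict-intrans-cat": {"RES": "sent", "ACT": "np"},
    "strict-trans-cat": {"RES": "strict-intrans-cat", "ACT": "np[acc]"},
    "ditrans-cat": {"RES": "strict-trans-cat", "ACT": "np[acc]"},
    "obl-trans-cat": {"RES": "strict-trans-cat", "ACT": "np[p-case]"},
}


def expand(cat):
    """Replace type names by their definitions until only basic categories remain."""
    if isinstance(cat, str):
        return expand(TYPES[cat]) if cat in TYPES else cat
    return {"RES": expand(cat["RES"]), "ACT": expand(cat["ACT"])}


def arity(cat):
    """Number of active signs, i.e. subcategorised arguments."""
    cat = expand(cat)
    count = 0
    while isinstance(cat, dict):
        count += 1
        cat = cat["RES"]
    return count


print(expand("obl-trans-cat"))
print(arity("strict-intrans-cat"), arity("strict-trans-cat"), arity("ditrans-cat"))  # 1 2 3
```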

The remaining category types are organised into:

`comp-cat' for verbs which take a sentential complement;
`xcomp-cat' for verbs which take a predicative complement, further divided according to whether control is involved or not.

Control categories are used to describe the syntactic structure of both equi and raising verbs. All control categories inherit from the following pattern, where the reentrancy tag [1] says that the active sign of the complement (i.e. the complement subject) is controlled by the immediately preceding active sign. Control is expressed by equating the entities which partially describe the semantics of the two active signs:

        CONTROL-CAT

                [RES: [RES: cat

                       ACT: [sign

                             SEM:ARG2: [1] entity]]

                 ACT: [sign

                       CAT:ACT: [sign

                                 SEM:ARG2: [1]]]]

The controlling argument is the subject if the verb is intransitive and the object if it is transitive (transitivity is determined by the presence of an accusative active np-sign).

        INTRANS-CONTROL-CAT

                [RES: [RES: sent

                       ACT: [sign

                             SEM:ARG2: [1] entity]]

                 ACT: [sign

                       CAT:ACT: [sign

                                 SEM:ARG2: [1]]]]



        TRANS-CONTROL-CAT

                [RES: [RES: strict-intrans-cat

                       ACT: [np-sign

                             CAT: np[acc]

                             SEM:ARG2: [1] entity]]

                 ACT: [sign

                       CAT:ACT: [sign

                                 SEM:ARG2: [1]]]]
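
The effect of the reentrancy tag in the control patterns can be imitated with a shared mutable object. The Python sketch below is a loose analogy rather than a unification engine: the class `Var` and the function `control_cat` are hypothetical names, and instantiating the controller's entity stands in for unification.

```python
class Var:
    """A shared slot standing for a reentrancy tag such as [1]."""
    def __init__(self):
        self.value = None


def control_cat(result="cat"):
    """Build the CONTROL-CAT skeleton with one slot shared by both active signs."""
    shared = Var()  # the tag [1]
    controller = {"SEM": {"ARG2": shared}}
    complement = {"CAT": {"ACT": {"SEM": {"ARG2": shared}}}}
    return {"RES": {"RES": result, "ACT": controller}, "ACT": complement}


cat = control_cat("sent")  # as in INTRANS-CONTROL-CAT
controller_arg = cat["RES"]["ACT"]["SEM"]["ARG2"]
complement_subject_arg = cat["ACT"]["CAT"]["ACT"]["SEM"]["ARG2"]

# Instantiating the controller also instantiates the complement subject.
controller_arg.value = "john"
print(complement_subject_arg.value)  # john
```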

Actual control categories are built by adding further specialisations to the control descriptions above. For example, the category structure for intransitive control verbs which take an infinitive VP complement, e.g. ``John wants/seems to leave'', is defined as follows:

        INTRANS-VPINF-CONTROL-CAT

        Inherits from INTRANS-CONTROL-CAT

                [RES: strict-intrans-cat

                 ACT: [vp-sign

                       CAT:RES: sent[fin]]]

Verb signs

Verb signs are defined by linking active signs in the category structure to argument slots in predicate-argument structures. This is done by means of reentrancy links, as indicated by the tag [1] in the structure for strict intransitive verbs below.

        [strict-intrans-sign

         CAT:ACT: [np-sign

                   SEM: [1] <e1>p-agt(e1,x)]

         SEM: [strict-intrans-sem

               <e1>and(<e1>pred(e1), [1])]]

Since templates are given only for verbs with at most three arguments, only two additional general linking patterns are needed:

        [two-arguments-verb-sign

         CAT: [RES: [RES: sent

                     ACT: [sign

                           SEM: [1]]]

               ACT: [sign

                     SEM: [2]]]

         SEM: <e1> and(and(pred(e1),[1]),[2])]



        [three-arguments-verb-sign

         CAT: [RES: [RES: [RES: sent

                           ACT: [sign

                                 SEM: [0]]]

                     ACT: [sign

                           SEM: [1]]]

               ACT: [sign

                     SEM: [2]]]

         SEM: <e1> and(and(and(pred(e1),[0]),[1]),[2])]
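
The two linking patterns differ only in depth, so they can be read as one procedure: walk the category from the innermost active sign outwards and conjoin each active sign's semantics with what has been built so far. The Python sketch below illustrates this reading; `active_sems` and `verb_sign` are hypothetical names, and semantic values are plain strings rather than shared feature structures.

```python
def active_sems(cat):
    """SEM values of the active signs, innermost (subject) first."""
    sems = []
    while isinstance(cat, dict):
        sems.append(cat["ACT"]["SEM"])
        cat = cat["RES"]
    return sems[::-1]


def verb_sign(pred, cat, event="e1"):
    """Link each active sign's semantics into the verb's own semantics."""
    body = f"{pred}({event})"
    for sem in active_sems(cat):
        body = f"and({body},{sem})"
    return {"CAT": cat, "SEM": f"<{event}> {body}"}


# A two-argument category with the tags of the pattern as placeholder values.
two_args = {"RES": {"RES": "sent", "ACT": {"SEM": "[1]"}},
            "ACT": {"SEM": "[2]"}}
print(verb_sign("pred", two_args)["SEM"])  # <e1> and(and(pred(e1),[1]),[2])
```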

To conclude, here are some sample two-arguments-verb-sign and three-arguments-verb-sign structures:

TWO-ARGUMENTS-VERB-SIGN

Strict transitives, e.g. ``John reads a book'':

        STRICT-TRANS-SIGN 

                [CAT: strict-trans-cat 

                 SEM: strict-trans-sem]

For subject equi verbs which take an infinitive VP complement, e.g. ``John wants to leave'':

        SUBJ-EQUI-INTRANS-VPINF-SIGN

                [CAT: intrans-vpinf-control-cat

                 SEM: p-agt-subj-intrans-xcomp/comp-sem]

THREE-ARGUMENTS-VERB-SIGN

Ditransitives, e.g. ``John gives Mary a book'':

        DITRANS-SIGN 

                [CAT: ditrans-cat

                 SEM: obl-trans/ditrans-sem]

Transitives which take an oblique object. According to Dowty's obliqueness hierarchy, the oblique complement (`np[p-case]') is the outermost subcategorised argument, although it is the direct object NP which follows the verb, e.g. ``John gives a book to Mary'':
```
        OBL-TRANS-SIGN

                [CAT: [RES: strict-trans-cat

                       ACT: [np-sign

                             CAT: np[p-case]]]

                 SEM: obl-trans/ditrans-sem]
```

Note that subcategorised arguments are positioned in the category structure of predicates according to the `obliqueness hierarchy', as in HPSG. For example, the `goal' argument of ditransitives and of transitives which subcategorise for a PP (DITRANS-SIGN and OBL-TRANS-SIGN above) is the outermost sign in the category structure, even though only in ditransitives does it precede the `theme' object (the difference in word order is handled syntactically, see Sanfilippo (1993b) and references therein).
