Four degrees of constraint are recognised in the description of word
categories by means of morphosyntactic tags:
- Obligatory attributes or values have to be included in any
morphosyntactic tagset. The major parts of speech (Noun, Verb, Conjunction,
etc.) belong here, as obligatorily specified.
- Recommended attributes or values are grammatical categories which occur
in conventional grammatical descriptions (e.g. Gender, Number, Person).
- Special extensions are subdivided to yield two further constraints:
  - Generic attributes or values are not usually encoded, but may be
    included by anyone tagging a corpus for a particular purpose. For
    example, it may be desirable for some purposes to mark semantic classes
    such as temporal nouns, manner adverbs, place names, etc. No
    specification of these features is made in the guidelines, except for
    exemplification purposes. They are purely optional.
  - Language-specific attributes or values may be important for a
    particular language, or perhaps for two or three languages at the most,
    but do not apply to the majority of European languages.
In practice, generic and language-specific features cannot be clearly distinguished.
The special extensions type is an acknowledgement that the guidelines are
not closed, but allow modification according to need.
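
The four degrees of constraint can be pictured as a small data structure. The following is only an illustrative sketch in Python, not part of the guidelines: the names `Constraint`, `ATTRIBUTE_CONSTRAINTS`, `missing_obligatory` and `is_special_extension` are assumptions made for this example, and `ExampleLanguageFeature` is a hypothetical placeholder, since no language-specific attribute is named here.

```python
from enum import Enum


class Constraint(Enum):
    """The four degrees of constraint described above."""
    OBLIGATORY = "obligatory"
    RECOMMENDED = "recommended"
    GENERIC = "generic"                      # special extension
    LANGUAGE_SPECIFIC = "language-specific"  # special extension


# The two sub-types that together make up the special extensions.
SPECIAL_EXTENSIONS = {Constraint.GENERIC, Constraint.LANGUAGE_SPECIFIC}

# Illustrative classification only; attribute names are assumptions.
ATTRIBUTE_CONSTRAINTS = {
    "PartOfSpeech": Constraint.OBLIGATORY,    # Noun, Verb, Conjunction, etc.
    "Gender": Constraint.RECOMMENDED,
    "Number": Constraint.RECOMMENDED,
    "Person": Constraint.RECOMMENDED,
    "SemanticClass": Constraint.GENERIC,      # e.g. temporal nouns, place names
    "ExampleLanguageFeature": Constraint.LANGUAGE_SPECIFIC,  # hypothetical
}


def missing_obligatory(tagset_attributes):
    """Return the obligatory attributes absent from a tagset, sorted by name."""
    present = set(tagset_attributes)
    return sorted(
        name
        for name, level in ATTRIBUTE_CONSTRAINTS.items()
        if level is Constraint.OBLIGATORY and name not in present
    )


def is_special_extension(attribute):
    """True if the attribute is generic or language-specific."""
    return ATTRIBUTE_CONSTRAINTS[attribute] in SPECIAL_EXTENSIONS
```

Under these assumptions, a tagset that encodes only Gender and Number would be reported as lacking the obligatory part-of-speech attribute, while a tagset that omits every special extension would still pass the check, because special extensions are purely optional.
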
The four types above correspond to the four types of constraint applied to
word categorisation in the lexicon. In general, this document repeats (in a
somewhat different form) much of the material dealing with morphosyntactic
categorisation in the lexicon, where further information on particular
features of the classification can be obtained.