Next: References Up: Recommendations for morphosyntactic categories Previous: Dealing with ambiguity

Recommendations

Multiple tagging practices: Form-function and lemmatisation

Ambiguity is just one of a number of phenomena for which some kind of multiple tagging of the same textword may be required. Other cases of multiple tagging which should be mentioned are:

1. Form-function tagging:
Sometimes the need is felt to assign two different tags to the same word: one representing the formal category, and the other the functional category, e.g.:

In principle, it can be argued that two tags should be assigned to each of these word types, and that the two should be distinctly encoded. In practice, tagging schemes to date have tended to give priority to one criterion over the other (i.e. to function over form, or vice versa). The annotation scheme for a given tagged corpus should state clearly which criterion has been applied.

2. Lemma tagging:
A morphosyntactically tagged corpus is generally supposed to specify the grammatical form of a textword, rather than to recover the lemma. However, in the transfer of information from a corpus to a lexicon or vice versa, it is assumed that a lemmatisation algorithm will have an important role. There is also a case (especially as a preliminary to syntactic and semantic annotation) for a type of annotation which specifies the lemma, as well as the grammatical form, for each textword. Lemma tagging, as this process may be called, has so far not been widely undertaken. Once again, the need is for independent ways of representing the lemma tag and the grammatical form tag.

For both of the above cases of multiple tagging, as well as for the tagging of ambiguity, more than one morphosyntactic tag needs to be assigned to the same word. There is a case for preferring a vertical format for presenting such a multiply-tagged corpus. Different kinds of word tagging can then be combined in the same annotated corpus, without confusion, by associating each kind of tag with a different field or column alongside the vertical text.
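The vertical format described above can be sketched as follows. This is a minimal illustration only, not a format prescribed by these recommendations: the tab separator, the column order, the tag labels (`DET`, `VERB-PART`, `ADJ`, `NOUN`) and the names `Token`, `to_vertical` and `from_vertical` are all assumptions made for the example, as is the sample phrase.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """One textword with each kind of tag held in its own field.

    All tag labels used with this class are hypothetical.
    """
    word: str
    form_tag: str      # formal category
    function_tag: str  # functional category
    lemma: str         # lemma tag, independent of the grammatical tags


def to_vertical(tokens: List[Token]) -> str:
    """Write one textword per line, one tab-separated column per kind of tag."""
    return "\n".join("\t".join(tok) for tok in tokens)


def from_vertical(text: str) -> List[Token]:
    """Read the vertical text back, checking that every line has all columns."""
    tokens = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != len(Token._fields):
            raise ValueError(
                f"line {line_no}: expected {len(Token._fields)} columns, "
                f"got {len(fields)}"
            )
        tokens.append(Token(*fields))
    return tokens


# Illustrative sample: 'running' is given a participle form tag and an
# adjective function tag (an assumed analysis, for demonstration only).
sample = [
    Token("the", "DET", "DET", "the"),
    Token("running", "VERB-PART", "ADJ", "run"),
    Token("water", "NOUN", "NOUN", "water"),
]

vertical = to_vertical(sample)
print(vertical)
```

Because each kind of tag occupies its own column, a further column (for instance, for alternative tags in cases of ambiguity) could be added without disturbing the existing ones.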

