Purpose

We assume that corpus annotation should provide as much linguistic information as possible; that is, tags should correspond to a (sub)set of the features used in further language processing, such as syntactic parsing or semantic analysis.

However, there is a conflict between what linguists or "higher processing steps" need and what taggers can technically provide, and this conflict must be taken into account when designing a tagset. The following section presents some of the theoretical questions involved in the design phase, some more practical test cases, and a description of the phenomena we chose for our tests.