A wide variety of parser and/or grammar evaluation methods have been used (and sometimes justified) in the literature. We describe the most important of these below, each with a brief critique. In §4 we discuss more general problems with evaluation and propose a higher-level, more task-orientated language in which to represent the information a parser should extract, together with suitable evaluation measures.
Extant parser/grammar evaluation methods divide into non-corpus and corpus-based methods; the latter subdivide further into methods based on unannotated corpora and methods based on annotated corpora.