Existing Syntactic Annotation Practices for Corpora

Syntactic annotation is the practice of adding syntactic information to a text by incorporating into it markers that indicate syntactic structure, e.g. labelled bracketing or symbols indicating dependency relations between words. The syntactic annotation practices surveyed here are limited to those which have been extensively applied to large textual corpora. We will not be concerned with how the process of corpus annotation is concretely carried out (e.g. manually, fully automatically, or interactively).
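To make the two kinds of marker concrete, the following minimal Python sketch annotates a toy sentence both ways. It is purely illustrative: the sentence, the category labels (S, NP, VP, Det, N, V), the relation names (det, subj) and the output notations are assumptions chosen for the example, not the conventions of any of the annotation schemata surveyed here.

```python
# Illustrative sketch only: labels, relation names and output notation are
# hypothetical and not taken from any specific annotation schema.

# Constituency view: a tree as nested tuples (label, child, child, ...),
# where a leaf child is simply the word string.
tree = ("S",
        ("NP", ("Det", "the"), ("N", "dog")),
        ("VP", ("V", "barks")))


def to_brackets(node):
    """Render a nested-tuple tree as a labelled bracketing string."""
    if isinstance(node, str):  # a leaf: the word itself
        return node
    label, *children = node
    return "[" + label + " " + " ".join(to_brackets(c) for c in children) + "]"


# Dependency view: (dependent, relation, head) triples between words.
dependencies = [("the", "det", "dog"), ("dog", "subj", "barks")]


def to_relations(deps):
    """Render dependency triples in a relation(head, dependent) notation."""
    return [f"{rel}({head}, {dep})" for dep, rel, head in deps]


bracketing = to_brackets(tree)
relations = to_relations(dependencies)
```

Here `bracketing` holds the string `[S [NP [Det the] [N dog]] [VP [V barks]]]`, while `relations` holds `det(dog, the)` and `subj(barks, dog)`. The first encodes grouping of words into labelled phrases; the second encodes labelled links between individual words.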

The general purpose of the present survey is to provide background information for the specifications of phrasal parsing adopted within the project. It is based on previous work in the framework of the NERC (Network of European Reference Corpora) project (Montemagni, 1992) and on the EAGLES Recommendations and Guidelines for the Syntactic Annotation of Corpora (Leech et al., 1996). We will not describe any particular annotation schema in detail; rather, we will consider the different layers of information which may or may not be instantiated within a specific schema, and the ways in which these layers can usefully be combined. Reference will be made to syntactically annotated corpora encoding these information layers. The annotation schemata which will be considered here are the following:

In what follows, we first distinguish two basic classes of annotation schemata according to the model of syntactic analysis they draw on, and then move on to a more detailed consideration of those layers of linguistic information which are relevant to our purposes.





Sparkle Project