...here
This means that the issue of tokenisation (segmentation of text into words and sentences) is not dealt with in this report. For example, the issues of whether to split off enclitics, and whether to treat multi-word expressions such as compounds as a single token, belong in part to morphosyntactic analysis and in part to text representation. For the same reason, we do not deal here with the representation of merged forms such as French du (=de + le).
...coextensive
In spoken language, the `orthographic sentence' plays no role, sentences being delimited in practice on the basis of syntax and possibly intonation.
...annotation
In a skeleton parse, constituents are identified with brackets, but much detail is omitted: for example, some brackets are left unlabelled, and functional labels such as Subject and Object are not applied.
...Corpus
Some of these names refer to the parsing system employed and others to the resulting treebank or syntactically annotated corpus.
...brackets
Currently the Helsinki Constraint Grammar is also being extended to Basque, but since Basque is not an official EU language, it will not be covered in this report. However, as Basque is typologically distinct from the Indo-European languages with which the EAGLES guidelines are initially largely concerned, efforts at parsing it may produce interesting results.
...
But not always: note that a dependency analysis may have crossing branches, or tokens which are dependents of more than one head.
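The two properties mentioned here can be checked mechanically. The following is only a minimal sketch, assuming a dependency analysis is given as a list of (dependent, head) pairs of 1-based token positions; this representation and the function names `crossing_pairs` and `multi_headed` are illustrative assumptions, not part of any scheme discussed in this report.

```python
from collections import defaultdict


def crossing_pairs(arcs):
    """Return pairs of arc spans that cross each other.

    Each arc is a (dependent, head) pair of token positions
    (hypothetical representation, chosen for illustration only).
    """
    # Direction is irrelevant for crossing, so reduce arcs to spans.
    spans = sorted((min(d, h), max(d, h)) for d, h in arcs)
    crossings = []
    for i, (a1, b1) in enumerate(spans):
        for a2, b2 in spans[i + 1:]:
            # After sorting, a1 <= a2; the spans cross when the second
            # starts strictly inside the first and ends strictly after it.
            if a1 < a2 < b1 < b2:
                crossings.append(((a1, b1), (a2, b2)))
    return crossings


def multi_headed(arcs):
    """Return a mapping from each dependent with more than one head to its heads."""
    heads = defaultdict(set)
    for dep, head in arcs:
        heads[dep].add(head)
    return {dep: sorted(hs) for dep, hs in heads.items() if len(hs) > 1}


# Token 2 depends on both 4 and 5, and the arc 1-3 crosses both of its arcs.
example = [(1, 3), (2, 4), (2, 5)]
print(crossing_pairs(example))
print(multi_headed(example))
```

An analysis for which both functions return empty results has no crossing branches and no token with more than one head.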
...exemplar
That is, in the sense that ENGCG is the only wide-coverage parser. Other parsers using a dependency syntax exist; see the references in Fraser & Hudson (1992).
...indices
These logical relations are marked only in the second phase of the Penn Treebank Project.
......
Discourse notions like `utterance' will not be dealt with in this report, although they are sometimes used to mark the outer bounds of parse trees.
...letter
Of course, the initial capital letter may in some cases be preceded by sentence-initial punctuation marks, such as the inverted question mark preceding Spanish questions.
...[S ... S]
By `exhaustive' we mean that all tokens which are part of the verbal content of the text should be included in an [S ... S]. This recommendation does not apply, of course, to words or symbols which are part of the mark-up of the text, such as those representing (in spoken transcriptions) pauses or the beginning and ending of a speaker's turn.
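As a rough illustration of the exhaustiveness requirement, the sketch below lists the tokens that fall outside every [S ... S]. It assumes a flat token list in which the strings `[S` and `S]` stand for the sentence brackets and in which mark-up symbols are supplied as a separate set; both conventions, and the symbol `<pause>`, are hypothetical and chosen only for this example.

```python
def tokens_outside_s(tokens, markup=frozenset()):
    """Return the verbal tokens not enclosed in any [S ... S].

    `tokens` is a flat list of strings; "[S" opens and "S]" closes a
    sentence bracket (assumed encoding). Tokens listed in `markup`
    are ignored, since the recommendation does not apply to them.
    """
    depth = 0
    outside = []
    for tok in tokens:
        if tok == "[S":
            depth += 1
        elif tok == "S]":
            depth -= 1
        elif tok in markup:
            continue
        elif depth == 0:
            # A verbal token that no sentence bracket encloses.
            outside.append(tok)
    return outside


transcript = ["<pause>", "[S", "I", "agree", "S]", "well", "[S", "yes", "S]"]
print(tokens_outside_s(transcript, markup={"<pause>"}))
```

Under these assumptions an annotation is exhaustive when the function returns an empty list; in the example, the token `well` is reported because it lies between two sentence brackets.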
...
Throughout these guidelines, as in this example, the syntactic annotations are purely illustrative, and are not meant to serve as a model to be imitated in all details.
...Subject
At this time, the representation of the Lexicon/Syntax Subgroup is still in progress, but the properties of syntactic features such as case and +/- predicative combine to give grammatical function information such as Object (from the combination of features: -subject, -predicative, case=accusative), Subject Predicative Complement (from the combination of features: +subject, +predicative), etc.
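The way feature values combine into a grammatical function can be sketched as a simple lookup. The example below covers only the two combinations named in this note; the function name, the boolean encoding of the +/- features, and the fallback value `None` are assumptions made for illustration, and the Subgroup's actual representation may well differ.

```python
def grammatical_function(subject, predicative, case=None):
    """Map a feature combination to a grammatical function label.

    `subject` and `predicative` are booleans standing for the
    +/- subject and +/- predicative features (assumed encoding).
    Only the two combinations cited in the text are handled.
    """
    if not subject and not predicative and case == "accusative":
        return "Object"
    if subject and predicative:
        return "Subject Predicative Complement"
    # Any other combination is left undecided in this sketch.
    return None


print(grammatical_function(subject=False, predicative=False, case="accusative"))
print(grammatical_function(subject=True, predicative=True))
```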
...
Other columns can be added to hold other types of information, such as POS tags.
...annotation
Since we are here concerned with syntactic ambiguity only, we ignore types of ambiguity (e.g. purely lexical ambiguity or pragmatic ambiguity) which have no bearing on syntactic annotation.
...speak
A third type of use ambiguity occurs where the human interpreter cannot make sense of the syntax of the sentence. For example, the following sentence from a computer manual cannot be readily interpreted or parsed by a non-specialist in computing: In this situation, the operator must ready a spool volume and IPL again.
...labelling
We exclude from consideration here ambiguities of part-of-speech tagging at word level. A special case of labelling ambiguity, these have been dealt with in draft guidelines for morphosyntactic annotation (see EAGLES (1996a)).