Next: Bracketing of single-word constituents Up: Issues in practical application Previous: Ambivalence

Recommendations

 

Punctuation

In the phrase structure examples given in the following sections, we have included punctuation within the bracketing. This again is a choice to be made by the scheme designers, and will probably depend on the format in which the corpus will finally be presented. However, in the syntactic annotation of written texts, there is a tradition of treating punctuation marks as `words' for the purpose of parsing. This can be advantageous for automatic parsing systems since punctuation is typically used to mark major syntactic boundaries.

As for sentence-initial and sentence-final punctuation, it seems sensible to enclose them within the parse bracketing, as in 85:

(85)  [``Good!'']
as opposed to 86:

(86)  ``[Good]!''
As regards medial punctuation, the most generally applicable guideline is to attach punctuation to the highest available node in the parse tree, thus assigning to medial punctuation symbols (especially commas) their value as delimiters of major constituents, as in 87:

(87)  [S [NP The words [PP at [NP level four NP] PP] NP] , [PP on [NP the other hand NP] PP] , [VP are [CO [ADJP relatively rare ADJP] , and [VP not often used [PP in speech PP] VP] CO] VP] . S]
However, this guideline does not always give a satisfactory analysis of correlative punctuation, such as matching commas or dashes to indicate the opening and closing of a parenthetical constituent. Thus in 88, it makes better sense to place the second comma inside the NP, rather than to make it an immediate constituent of the sentence:

(88)  [S [NP the teacher , [CL-REL who arrived late CL-REL] , NP] [VP had noticed [NP nothing NP] VP] . S]
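The difference between examples 87 and 88 can be checked mechanically. The following is a minimal sketch, not part of any published annotation tool: it assumes the labelled bracketing used in these examples (an opening `[LABEL`, a closing `LABEL]`, whitespace between all tokens), and the function names `parse_bracketing` and `punctuation_parents` are hypothetical. It reports which node each punctuation token is attached to.

```python
import string

# Examples 87 and 88 from the text, in the labelled bracket notation.
EX87 = ("[S [NP The words [PP at [NP level four NP] PP] NP] , "
        "[PP on [NP the other hand NP] PP] , "
        "[VP are [CO [ADJP relatively rare ADJP] , and "
        "[VP not often used [PP in speech PP] VP] CO] VP] . S]")
EX88 = ("[S [NP the teacher , [CL-REL who arrived late CL-REL] , NP] "
        "[VP had noticed [NP nothing NP] VP] . S]")


def parse_bracketing(text):
    """Parse labelled bracketing into nested (label, children) tuples.

    Assumption: '[' is always fused with the label that follows it and
    ']' with the label that precedes it, as in the examples above.
    """
    root = ("ROOT", [])
    stack = [root]
    for tok in text.split():
        if len(tok) > 1 and tok.startswith("["):
            node = (tok[1:], [])          # open a new constituent
            stack[-1][1].append(node)
            stack.append(node)
        elif len(tok) > 1 and tok.endswith("]"):
            if len(stack) == 1 or stack[-1][0] != tok[:-1]:
                raise ValueError("unmatched closing bracket: " + tok)
            stack.pop()                   # close the current constituent
        else:
            stack[-1][1].append(tok)      # ordinary word or punctuation
    if len(stack) != 1:
        raise ValueError("unclosed bracket: " + stack[-1][0])
    return root


def punctuation_parents(node):
    """Return (token, parent label) for every punctuation token, in order."""
    label, children = node
    found = []
    for child in children:
        if isinstance(child, tuple):
            found.extend(punctuation_parents(child))
        elif all(ch in string.punctuation for ch in child):
            found.append((child, label))
    return found


for name, example in (("87", EX87), ("88", EX88)):
    print(name, punctuation_parents(parse_bracketing(example)))
```

In 87 the two delimiting commas and the full stop are reported as daughters of S (the remaining comma belongs to the coordination CO), whereas in 88 both correlative commas are reported as daughters of NP. A check of this kind could help confirm that a corpus follows whichever punctuation policy its annotation scheme states.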
However, our purpose here is not to dictate solutions. The principle to be adhered to is simply that the annotation scheme should state explicitly whatever treatment of punctuation the annotator adopts. (Sampson 1995: 153f provides a considered treatment of punctuation issues.)


