A further issue that requires explicit handling in the documentation is the following: what principle is applied in deciding whether to bracket non-terminal constituents consisting of one word? Two extreme policies here would be (a) to leave all single-word constituents unbracketed, and (b) to bracket all single-word constituents that show their phrasal status by the possibility of adding modifiers, or of replacing them with a multi-word phrase.
There are difficulties with both these solutions. Solution (a) is a problem, for example, where coordination occurs between a single word and a multi-word constituent, as in 89:
|(89)||[NP [NP John NP] and [NP his sister NP] NP]|
Solution (b), on the other hand, leads to a proliferation of brackets, as in 90:
|(90)||[NP [DETP many DETP] [ADJP recent ADJP] [ADJP unexpected ADJP] arrivals NP]|
Of these two policies, (b) is a problem only for the human reader. When software is used to display bracketed sentences as trees (or in any other format), the proliferation of brackets is not an issue at all. Thus, policy (b) may be considered preferable, especially for interchange and retrieval purposes, since filtering out extra brackets is a much simpler task than inserting new ones. However, if a compromise between the two extremes is needed, a useful one may be to bracket single-word constituents only where they represent major constituents in the sentence, e.g. Subject or Object, or where they are coordinated with multi-word constituents, as in 89 above. Such conditions need to be made explicit in the documentation of the annotation scheme. (For an actual application, see Sampson 1995: 172f for an account of how single-word constituents are handled in the SUSANNE corpus.)
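The claim that extra brackets are easy to filter out can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the bracketing is written as whitespace-separated tokens with labelled opening and closing brackets, as in examples 89 and 90. The function names (`parse`, `strip_single_word`, `in_coordination`, `render`) are hypothetical and are not part of any annotation scheme or existing tool.

```python
EX89 = "[NP [NP John NP] and [NP his sister NP] NP]"
EX90 = "[NP [DETP many DETP] [ADJP recent ADJP] [ADJP unexpected ADJP] arrivals NP]"


def parse(text):
    """Parse one labelled bracketing into nested lists: [label, child, ...]."""
    root = ["ROOT"]
    stack = [root]
    for tok in text.split():
        if len(tok) > 1 and tok.startswith("["):
            node = [tok[1:]]              # opening bracket, e.g. "[NP"
            stack[-1].append(node)
            stack.append(node)
        elif len(tok) > 1 and tok.endswith("]"):
            # closing bracket, e.g. "NP]": its label must match the open node
            if len(stack) == 1 or stack[-1][0] != tok[:-1]:
                raise ValueError(f"unmatched closing bracket: {tok}")
            stack.pop()
        else:
            stack[-1].append(tok)         # an ordinary word
    if len(stack) != 1 or len(root) != 2 or isinstance(root[1], str):
        raise ValueError("expected exactly one fully bracketed constituent")
    return root[1]


def words(node):
    """Return the words dominated by a node, in order."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]


def in_coordination(node, parent):
    """Assumed compromise condition: a sibling with the same label spans
    more than one word (a rough stand-in for 'coordinated with a
    multi-word constituent')."""
    return any(
        isinstance(sib, list) and sib is not node
        and sib[0] == node[0] and len(words(sib)) > 1
        for sib in parent[1:]
    )


def strip_single_word(node, keep=None, parent=None):
    """Remove brackets around single-word constituents, except the
    outermost constituent and any node for which keep(node, parent) is true."""
    if isinstance(node, str):
        return node
    children = [strip_single_word(child, keep, node) for child in node[1:]]
    single = len(children) == 1 and isinstance(children[0], str)
    if parent is not None and single and not (keep and keep(node, parent)):
        return children[0]
    return [node[0]] + children


def render(node):
    """Write a tree back out in the labelled-bracket notation."""
    if isinstance(node, str):
        return node
    inner = " ".join(render(child) for child in node[1:])
    return f"[{node[0]} {inner} {node[0]}]"


print(render(strip_single_word(parse(EX90))))
# [NP many recent unexpected arrivals NP]
print(render(strip_single_word(parse(EX89))))
# [NP John and [NP his sister NP] NP]
print(render(strip_single_word(parse(EX89), keep=in_coordination)))
# [NP [NP John NP] and [NP his sister NP] NP]
```

Filtering needs only the bracket structure itself. The reverse operation would additionally require deciding the category of every unbracketed word, which is why the insertion of new brackets is the harder task.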