
Documented

  The default value is documented. This means that, as proposed in (NERC1994), full details about the constituents of a component are kept separately from the component itself. The model for this is the DTD or header of SGML and, following that, of TEI. In contrast to the recommendations of those bodies, corpus users seem to prefer to keep the documentation of texts in a separate place from the texts themselves, and to include only a minimal header that contains a reference to the documentation. For the management of corpora, this practice allows plain text to be separated effectively from annotation with only a small amount of programming effort. Since DTDs can be extremely verbose, the efficiency of real-time search procedures is hampered if they are not detachable.
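
The small amount of programming effort mentioned above might look roughly like the following sketch. It is illustrative only: the one-line header form `<header doc="..."/>`, the function name `split_corpus_file` and the sample path are assumptions made for this example, not a format prescribed by SGML, TEI or NERC.

```python
import re

# Hypothetical minimal header: one first line that only points to the
# externally stored documentation (this format is an assumption).
HEADER_RE = re.compile(r'^<header\s+doc="([^"]*)"\s*/>\s*$')

# Naive pattern for inline annotation: anything between angle brackets.
TAG_RE = re.compile(r"<[^>]+>")


def split_corpus_file(raw: str):
    """Return (doc_ref, plain_text) for a text with an optional one-line header.

    doc_ref is the reference to the separate documentation, or None when
    the first line is not a header of the assumed form.
    """
    first, _, rest = raw.partition("\n")
    match = HEADER_RE.match(first)
    if match is None:
        doc_ref, body = None, raw
    else:
        doc_ref, body = match.group(1), rest
    # Strip the inline annotation so that only plain text remains.
    plain = TAG_RE.sub("", body)
    return doc_ref, plain


if __name__ == "__main__":
    sample = '<header doc="docs/text042.txt"/>\n<s>The cat <w pos="VB">sat</w>.</s>\n'
    print(split_corpus_file(sample))
```

Because the header carries nothing but a reference, a search procedure can skip a single line and work on the text directly, consulting the documentation only when it is needed.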