Subcorpus, component and sublanguage

  A corpus can be divided into subcorpora. A subcorpus has all the properties of a corpus but happens to be part of a larger corpus. Corpora and subcorpora are in turn divided into components. A component is not necessarily an adequate sample of a language, and in that respect it differs from both a corpus and a subcorpus. It is a collection of pieces of language selected and ordered according to a set of linguistic criteria that characterise its linguistic homogeneity. Whereas a corpus may illustrate heterogeneity, as may a subcorpus to some extent, a component illustrates one particular type of language. What are called sublanguages are components under this definition, but sublanguages are subject to further restrictions, which are dealt with later.
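
  The containment relations described above can be pictured as a small data model. The following is only an illustrative sketch, not part of the definitions themselves: the class names, the `criteria` mapping, the `all_components` helper and the sample data are all hypothetical, chosen to show that a subcorpus is structurally a corpus nested in another corpus, while a component is a different kind of object defined by its selection criteria.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Component:
    """A homogeneous collection of texts selected by linguistic criteria."""
    name: str
    criteria: dict[str, str]  # hypothetical: criterion name -> required value
    texts: list[str] = field(default_factory=list)


@dataclass
class Corpus:
    """A corpus; a subcorpus is simply a Corpus held inside another Corpus."""
    name: str
    subcorpora: list[Corpus] = field(default_factory=list)
    components: list[Component] = field(default_factory=list)

    def all_components(self) -> list[Component]:
        """Collect the components of this corpus and of every nested subcorpus."""
        found = list(self.components)
        for sub in self.subcorpora:
            found.extend(sub.all_components())
        return found


# Invented sample data, for illustration only.
legal = Component("legal", {"domain": "law"}, ["text A"])
news = Component("news", {"domain": "journalism"}, ["text B", "text C"])
written = Corpus("written", components=[legal, news])
whole = Corpus("whole", subcorpora=[written])

print([c.name for c in whole.all_components()])
```

  In this sketch the heterogeneity of a corpus shows up as the variety of components it contains, while each component carries a single set of criteria. Whether components should also be allowed to nest is left open here, since the text does not say.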