
Subcorpus, component and sublanguage

  A corpus can be divided into subcorpora. A subcorpus has all the properties of a corpus but happens to be part of a larger corpus. Corpora and subcorpora are in turn divided into components. Unlike a corpus or a subcorpus, a component is not necessarily an adequate sample of a language. It is a collection of pieces of language selected and ordered according to a set of linguistic criteria that serve to characterise its linguistic homogeneity. Whereas a corpus, and to some extent a subcorpus, may illustrate heterogeneity, a component illustrates a particular type of language. What are called sublanguages are components under this definition, but sublanguages are subject to further restrictions, which are dealt with later.
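
  For readers who handle corpora computationally, the containment relations defined above might be modelled roughly as follows. This is only an illustrative sketch: the class names, field names and sample data are hypothetical and are not part of the definitions themselves. A subcorpus is represented as a corpus nested inside another corpus, and a component records the linguistic criteria by which its texts were selected.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """A linguistically homogeneous collection of texts.

    It need not be an adequate sample of a language.
    """
    name: str
    criteria: List[str]  # linguistic criteria used to select the texts (hypothetical field)
    texts: List[str] = field(default_factory=list)


@dataclass
class Corpus:
    """A corpus. A subcorpus is simply a Corpus held inside another Corpus."""
    name: str
    subcorpora: List["Corpus"] = field(default_factory=list)
    components: List[Component] = field(default_factory=list)

    def all_components(self) -> List[Component]:
        """Collect the components of this corpus and of all nested subcorpora."""
        found = list(self.components)
        for sub in self.subcorpora:
            found.extend(sub.all_components())
        return found


# Invented sample data, for illustration only.
legal = Component("legal", ["domain: law", "mode: written"], ["contract text"])
news = Component("news", ["domain: journalism"], ["front-page story"])
interviews = Component("interviews", ["mode: spoken"])

written = Corpus("written", components=[legal, news])
spoken = Corpus("spoken", components=[interviews])
whole = Corpus("whole", subcorpora=[written, spoken])

component_names = [c.name for c in whole.all_components()]
```

  In this sketch the corpus as a whole is heterogeneous because it gathers components of different types, while each component is characterised by a single set of criteria. The further restrictions that apply to sublanguages are not modelled.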



Converted into html by Alessandro Enea
Mon May 15 10:24:42 DFT 1995