The classification of texts into different genres seems to have been achieved mostly through external criteria rather than through internal, linguistic criteria. Bhatia (1993) claims that the nature of genre can be specified through external criteria. The external criteria usually specified consist of information concerning the speaker/writer and audience, the relations between these two groups, the intended goals of the author, and the historical, socio-cultural, philosophic and occupational details. Bhatia would further determine the genre of a text by relating it to other texts of the same genre and by identifying topic, again externally.
Whether the text is written or spoken is a basic distinction, though Sager (1980) identifies types of text within the language of science and technology which can be either, such as the report, the memo and the schedule, where the medium would not necessarily determine the text type. Types are distinguished instead on grounds of the status of the participants and their knowledge or authority on the subject of the text. These modes of expression are also based on the level of pre-planning, that is, whether the text is spontaneous or prepared.
Sager also claims that such matters as informative and evaluative intentions provide criteria for subdivision of texts. This is the basic classification of the written part of the British National Corpus -- which is primarily divided into the categories of informative and imaginative writing -- the intention being the fundamental division between text types. Informative writings include the fields of sciences, arts, commerce and finance, belief and thought, leisure, natural and pure science, social science and world affairs. Imaginative writing comprises literary and creative works. These types are then further categorised according to other criteria, such as medium, date of publication, topic and so on.
An early distinction between informative and other text -- whether the other is said to be evaluative or imaginative -- is of doubtful validity because it perpetuates the illusion that many texts have as their principal aim the transfer of information. Oversimplifications regarding the concept of information abound in informatics -- in fact the science of informatics is based on the oversimplification that information can somehow be decoupled from the rest of communication. In contrast, Hunston (1989) has shown that the blandest and -- apparently -- most objective scientific report is seething underneath with evaluation.
The category `factual' is far wider than `informative', but often the two seem to be merged. The distinction between fact and fiction is relevant, but it is a distinction that cuts across others, and concerns the relation of the utterances to objective reality, and not in the first instance to their purposes. In contrast, `informative' is a term that relates to purpose.
Hence, in the typology proposed, the category of information is one of the categories of outcome, and is restricted to reference works that appear to have no practical relevance beyond being repositories of data.
Halliday & Martin (1993) do not offer a formal text classification system. They identify the areas of field, mode and tenor through internal criteria. Since language varies with situation (which is presumably the fundamental reason for text classification at all), we can identify particular specialised languages through the specification of certain internal linguistic criteria -- that is, the frequency of lexico-grammatical features. Halliday & Martin associate scientific text with passives, nominalisations and so on. However, it would seem that some sort of initial external classification is necessary, to begin with at least, in order to specify the characteristics of a particular genre, here a `scientific text'. The internal linguistic criteria of the text are analysed only after the initial selection based on external criteria, and are then upheld as particular to the genre of the `scientific text'. This classification thus begins with external classification and subsequently focuses on linguistic criteria. If the linguistic criteria are then related back to the external classification and the categories adjusted accordingly, a sort of cyclical process ensues until a level of stability is established. Such a process would additionally check that there were no other prominent varieties that showed similar linguistic features -- a point that can be readily overlooked when an investigation starts with externally-defined texts.
Biber (1989), on the other hand, rejects typologies which have been determined by initially isolating an important external difference among texts and have subsequently attempted to identify the linguistic features associated with that difference. Biber therefore establishes a typology of texts based on internal linguistic criteria only and then interprets the results with reference to external `functions'. Biber's internal criteria are taken from published studies of language variation; therefore they can be assumed to be relevant. The resultant typology ultimately distinguishes eight text types based on five `dimensions' identified automatically through cluster analysis, as we will see below.
This classification of texts based purely on internal criteria does not give prominence to the sociological environment of the text, thus obscuring the relationship between the linguistic and non-linguistic criteria. Atkins et al. (1992) believe that:
it is impossible to `balance' a corpus on the basis of extra-linguistic features alone
but similarly that:
a corpus selected entirely on internal criteria would yield no information about the relation between language and its context of situation.
It seems most feasible that texts be selected initially on external criteria, then once the particular linguistic features of a `text type' have been established through analysis of internal criteria these can be used in selection and classification of texts. Biber later suggests that a cyclical process of refinement between internal and external criteria is necessary (Biber, 1993).
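The cyclical process described above can be illustrated with a minimal sketch in Python. It is not Biber's procedure nor that of any of the authors cited; it simply assumes that each text has already been reduced to a vector of internal feature rates (here, hypothetically, the rate of nominalisations and of first- and second-person pronouns), starts from externally assigned categories, and repeatedly moves each text to the category whose average profile it most resembles, until the assignment is stable. The function names, the category names and the figures are all invented for illustration.

```python
def centroid(vectors):
    """Average profile of a group of feature vectors."""
    dims = len(vectors[0])
    return tuple(sum(v[d] for v in vectors) / len(vectors) for d in range(dims))


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def refine(vectors, external_labels, max_rounds=20):
    """Adjust externally assigned categories against internal criteria.

    Each round computes the average internal profile of every category,
    then reassigns each text to the nearest profile. The loop stops when
    no text changes category (the 'level of stability') or after
    max_rounds, whichever comes first.
    """
    labels = list(external_labels)  # work on a copy
    for _ in range(max_rounds):
        groups = {}
        for v, lab in zip(vectors, labels):
            groups.setdefault(lab, []).append(v)
        centroids = {lab: centroid(vs) for lab, vs in groups.items()}
        new_labels = [
            min(sorted(centroids), key=lambda lab: sq_dist(v, centroids[lab]))
            for v in vectors
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Invented data: (nominalisation rate, personal pronoun rate) per text.
vectors = [(0.30, 0.01), (0.28, 0.02), (0.05, 0.12), (0.04, 0.10), (0.27, 0.01)]
# Hypothetical external classification; the last text is catalogued as
# fiction although its internal profile resembles the science texts.
external = ["science", "science", "fiction", "fiction", "fiction"]

refined = refine(vectors, external)
```

In this invented example the last text moves from `fiction' to `science' after one round and the assignment then remains stable. A text that changes category in this way is precisely the kind of case that would prompt a return to the external criteria, as the cyclical account suggests.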