
Transcription conventions adopted by the Network of European Reference Corpora (NERC)

 

The NERC project has aimed to define a minimal level of textual representation for European corpora, both written and spoken. In his final report on phonetic and prosodic annotation, Teubert (1993:2) concludes that:

After careful analysis, the NERC consortium has decided to recommend the Transcription Conventions as developed by J.P. French, and in particular the level two transcription rules, for orthographic transcripts.

French's system (French, 1991, 1992) was used mainly for transcribing the spoken corpus developed within the COBUILD project, a joint venture between the University of Birmingham and Collins publishers established in 1980 (Sinclair (Ed.), 1987). The transcription system involves four levels, which are described in more detail in 2.5.2. The recommended Level Two is an enhanced orthographic representation that records basic information about speakers, turn-taking and non-verbal elements: speaker identity, speaker change, overlaps, laughter, etc. According to French, this level is suitable for linguistic studies that do not require intonational information.

Illustrations of the system can be found in French (1991) and Payne (1995) for English, Pisa (1992) for Italian, De Jong (1992) for Dutch, Scheiter (1992a, b) for German, and Villena-Ponsoda (1992, 1994) for Spanish. For a specific treatment of conversational exchanges, see Psathas & Anderson (1992).