

Natural Language Generation


The problem of automatically producing natural language texts is becoming increasingly salient with the constantly growing demand for technical documents in multiple languages; for intelligent help and tutoring systems that are sensitive to the user's knowledge; and for hypertext that adapts to the user's goals, interests and prior knowledge, as well as to the presentation context. This section outlines the problems, stages and knowledge resources involved in natural language generation.


Natural Language Generation (NLG) systems produce language output (ranging from a single sentence to an entire document) from computer-accessible data usually encoded in a knowledge or data base. Often the input to a generator is a high-level communicative goal to be achieved by the system (which acts as a speaker or writer). During the generation process, this high-level goal is refined into more concrete goals which give rise to the generated utterance. Consequently, language generation can be regarded as a goal-driven process which aims at adequate communication with the reader/hearer, rather than as a process aimed entirely at the production of linguistically well-formed output.
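The refinement of a high-level communicative goal into more concrete goals can be sketched as a top-down expansion. The following is a minimal, purely illustrative sketch; the goal names and refinement rules are invented and do not correspond to any particular NLG system.

```python
# Hypothetical sketch of goal-driven generation: a high-level communicative
# goal is recursively refined into concrete sub-goals that give rise to the
# utterance. All goal names and rules here are illustrative assumptions.

# Each rule maps an abstract goal to the more concrete goals that achieve it.
REFINEMENT_RULES = {
    "describe-device": ["identify-device", "list-parts", "state-function"],
    "list-parts": ["mention-part:motor", "mention-part:switch"],
}

def refine(goal):
    """Expand a goal top-down until only concrete (leaf) goals remain."""
    subgoals = REFINEMENT_RULES.get(goal)
    if subgoals is None:          # leaf goal: directly realisable
        return [goal]
    result = []
    for sub in subgoals:
        result.extend(refine(sub))
    return result

print(refine("describe-device"))
# ['identify-device', 'mention-part:motor', 'mention-part:switch', 'state-function']
```

The leaf goals produced by such a refinement would then be passed on to the later generation stages described below.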

Generation Sub-Tasks

In order to structure the generation task, most existing systems divide it into the following stages, which are often organised in a pipeline architecture:

Content Determination and Text Planning:
This stage involves decisions regarding the information which should be conveyed to the user (content determination) and the way this information should be rhetorically structured (text planning). Many systems perform these tasks simultaneously, because rhetorical goals often determine what is relevant. Most text planners have hierarchically organised plans and apply decomposition in a top-down fashion, following AI planning techniques. However, some planning approaches (e.g., schemas [McK85], Hovy's structurer [Hov90]) rely on previously selected content, an assumption which has proved inadequate for some tasks (e.g., a flexible explanation facility [Par91,Moo90]).

Surface Realisation:
This stage involves generating the individual sentences in a grammatically correct manner, handling, e.g., agreement, reflexives and morphology.
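The pipeline of stages described above can be sketched end to end. This is a minimal toy illustration under invented assumptions: the fact tuples, the trivial rhetorical ordering and the agreement rule are all made up for the example, not drawn from any system cited here.

```python
# Minimal sketch of a pipelined generator. The stage boundaries follow the
# text above; the domain facts and all rules are invented for illustration.

def determine_content(facts, topic):
    """Content determination: select the facts relevant to the topic."""
    return [f for f in facts if f[0] == topic]

def plan_text(selected):
    """Text planning: impose a (here deliberately trivial) rhetorical
    ordering on the selected facts."""
    return sorted(selected, key=lambda f: f[1])

def realise(fact):
    """Surface realisation: map one fact to a grammatical sentence,
    handling subject-verb agreement."""
    subject, _attr, value = fact
    verb = "are" if subject.endswith("s") else "is"
    return f"The {subject} {verb} {value}."

facts = [("printer", "status", "offline"), ("queue", "length", "empty"),
         ("printer", "colour", "grey")]
sentences = [realise(f) for f in plan_text(determine_content(facts, "printer"))]
print(" ".join(sentences))
# The printer is grey. The printer is offline.
```

Each stage consumes the previous stage's output and commits to further decisions, which is precisely what makes the pipeline convenient but also rigid.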

However, it is worth noting that there is no agreement in the NLG community on the exact problems addressed at each of these stages; they vary between different approaches and systems.

Knowledge Sources

In order to make these complex choices, language generators draw on various knowledge resources.

The formalism used to represent the input semantics also affects the generator's algorithms and its output. For instance, some surface realisation components expect a hierarchically structured input, while others use non-hierarchical representations. The latter address the more general task, in which the message is largely free of language-specific commitments and the selection of all syntactically prominent elements is made from both conceptual and linguistic perspectives. Examples of different input formalisms are: hierarchies of logical forms [McK90], functional representations [Sig91], predicate calculus [McD83], SB-ONE (similar to KL-ONE) [Rei91], and conceptual graphs [Nik95].
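The contrast between hierarchical and non-hierarchical input can be made concrete with two toy representations of the same message. Both structures and the crude realiser below are invented for illustration and are not taken from the systems cited above.

```python
# Illustrative contrast between a hierarchical and a flat (non-hierarchical)
# input to a surface realiser. All structures here are invented examples.

# Hierarchical input: the syntactic head and its arguments are nested,
# so the realiser's job is largely to linearise the tree.
hierarchical = {
    "pred": "own",
    "args": {"agent": {"pred": "Mary"}, "theme": {"pred": "car"}},
}

# Flat input: an unordered set of propositions about an event "e1"; here
# the realiser must itself decide which element becomes the syntactic head.
flat = [("own", "e1"), ("agent", "e1", "Mary"), ("theme", "e1", "car")]

def realise_tree(node):
    """Linearise a hierarchical logical form into a crude English clause,
    assuming a regular third-person-singular verb."""
    args = node.get("args")
    if not args:
        return node["pred"]
    return (f"{realise_tree(args['agent'])} {node['pred']}s "
            f"the {realise_tree(args['theme'])}")

print(realise_tree(hierarchical))   # Mary owns the car
```

A realiser for the flat input would need an extra step, choosing the head predicate among the propositions, which is exactly the additional freedom (and burden) that non-hierarchical formalisms introduce.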

Related Areas and Techniques

Machine Translation (§4.1), Text Summarisation (§4.4).

EAGLES Central Secretariat