It is only since the mid-1980s that standardisation and reusability issues have taken on increasing
importance in language engineering. Trends in linguistics and computational linguistics made it clear that the lexicon would be
the keystone of linguistic description for the foreseeable future. As confidence grew in being able to build
large computational grammatical descriptions of language, concern also grew over their associated, much
larger, lexical descriptions: if one were to change, for example, one's grammatical theory, how could one
cope with the knock-on effects on the lexicon, especially in a European multilingual context?
In the early 1980s, work had begun on extraction of information from publishers' machine-readable
dictionaries, with little co-operation and few common frameworks. Nevertheless, it was the lexicon field which
provided the impetus for standardisation issues to be addressed in other fields. A seminal workshop which
set in motion a number of efforts towards standardisation and reusability took place in 1986.
Numerous recommendations arose from this workshop, leading to a common
statement that high priority should be assigned to the design and development of large, reusable,
multifunctional, precompetitive, multilingual linguistic resources, implying the need for standards.
Various follow-up meetings took place between 1986 and 1988, at workshops and conferences. Similar
events have taken place in increasing numbers since then. Also, bodies such as EURALEX and the
Association for Computational Linguistics (via SIGLEX) have sponsored the formation of groups that
worked on relevant issues. However, it was especially the EC that took a strong interest in encouraging
developments in the area of reusable linguistic resources, from an early date. Four projects were initiated
in various EC frameworks at the beginning of the 1990s, looking at aspects of the reusability area:
ACQUILEX, EUROTRA-7, MULTILEX and NERC (Network of European Reference Corpora). Almost
immediately, these projects, which were all concerned with preparatory or enabling aspects, began to
discuss ways and means of co-operating to reach common objectives, under the aegis of the EC. At
roughly the same time, the EUREKA project GENELEX was set up to establish a convergence path for various
lexical assets owned by that consortium into a common conceptual structure.
GENELEX, MULTILEX and EUROTRA-7 had much in common. Importantly, there
was also a degree of overlap among these projects in terms of participants. Given their common interests
and the growing importance of the need for standards, they came together to form a Co-ordination Group,
under the aegis of the EC. NERC became a focal point for
corpus-related work in the Union, in which standards for, e.g., corpus annotation became of interest.
In the speech community, the significant ESPRIT SAM project covered many aspects of
speech and resulted in a wealth of documents on standards-related aspects and numerous standard tools
for speech.
The situation sketched above created the appropriate conditions for the launching, firstly, of a project
definition study in 1992 and then, in February 1993, of the LRE EAGLES project, aiming specifically at
defining standards for selected areas or preparing the ground for future standard provision.
Since the formation of EAGLES, standards-related work in the EU has been largely concentrated within
this initiative. Related efforts elsewhere are closely linked with EAGLES and feed off it. Besides
developing from pre-existing preparatory work on the lexicon, on corpora and on speech aspects,
LRE EAGLES initiated standards-related work in the areas of NLP formalisms and evaluation of NLP
systems. Evaluation in particular has proved to be a point of focus of many interested groups and projects
throughout the world and the EAGLES evaluation group has had significant input from them: indeed, it
has acted as a catalyst and testing-ground.
Several LRE projects actively contributed comments and tested EAGLES proposals, thus
offering a concrete industry-related setting. EAGLES itself has had substantial industrial
participation, and there has been significant advance in LE standards over the past two years. This
re-emphasises the need to involve industry in such efforts when targeting clearly identified and
motivated standardisation goals.
EAGLES results to date are to be seen as a first step on the path towards standardisation for LE purposes.
The immediate future is thus, besides the development of proposals covering new areas, the time for
consolidation, testing, refinement and dissemination of these initial results, via a second cycle of
EAGLES activity which is the focus of the present project.