
Preliminary Recommendations on Semantic Encoding
Interim Report

The EAGLES Lexicon Interest Group

Antonio Sanfilippo (Chair)
Nicoletta Calzolari (Coordinator)
Sophia Ananiadou (Subgroup coordinator)
Rob Gaizauskas (Subgroup coordinator)
Patrick Saint-Dizier (Subgroup coordinator)
Piek Vossen (Subgroup coordinator)
Antonietta Alonge
Nuria Bel
Kalina Bontcheva
Pierrette Bouillon
Paul Buitelaar
Federica Busa
Mouna Kamel
Maite Melero Nogues
Simonetta Montemagni
Pedro Diez-Orzas
Wim Peters
Fabio Pianesi
Vito Pirrelli
Frederique Segond
Christian Sjögreen
Mark Stevenson
Maria Toporowska Gronostaj
Marta Villegas Montserrat
Antonio Zampolli

May 1998


Within the last decade, the availability of robust tools for language analysis has provided an opportunity for using semantic information to improve the performance of Machine Translation and document management applications such as Information Retrieval/Extraction and Summarization. As this trend consolidates, the need for a protocol which helps normalize and structure the information needed for the creation of reusable lexical resources within the applications of focus becomes more pressing. The goal of this report is to address this requirement through the provision of guidelines for standards in the encoding of semantic information in lexical resources for Machine Translation and Information Systems.

The present version of the report provides an account of the ongoing activities of the EAGLES Lexicon Interest Group on semantic encoding for the period May 1997 through February 1998. EAGLES is an initiative sponsored by the European Commission to promote the creation of standards in the area of Language Engineering. Currently, EAGLES is in its second round of funding, which is scheduled to last until September 1998. For information on the current EAGLES project visit the web site For information on previous EAGLES activities see

The goal of the Lexicon Interest Group is to provide guidelines for the standardization of lexical encoding. The current work is intended to provide preliminary recommendations on lexical semantic encoding. This work is meant to extend the results of standardization activities in the area of syntactic subcategorization previously carried out by the EAGLES Lexicon Interest Group, see

The proposed extension addresses the annotation of

The work has a special focus on requirements for Machine Translation and Information Systems. These applications have had a major impact on the electronics and computing industry and are therefore excellent candidates for providing a focus for standardization in the area of language engineering.

The workplan includes the survey of current practices in the encoding of semantic information and the provision of guidelines for standards in the area of focus with reference to:

According to this plan, the present report provides a survey of lexical semantic notions as

In keeping with the overall objective of the group, the goal of this survey is to provide the basic environment in which to table proposals for standardization in lexical semantic encoding. More specifically, the survey is intended to indicate the sort of issues at stake in standardization, e.g. what the basic notions are and how they are used. It does so through a comparative assessment of the status of lexical semantics in theoretical linguistics, lexicography and language engineering.

Although we have tried to cover all major topics and applications concerned with equal attention, we recognize that the survey may fail to address several relevant areas and possibly provide too specific a treatment of others. Meaning is such a pervasive aspect of linguistic knowledge that a totally unbiased and perfectly balanced survey of theoretical and applied work involving lexical semantic notions is a very difficult, perhaps impossible, project to carry out. Fortunately, such a limitation is not likely to affect our purposes. Indeed, our choice of lexical semantics issues and language resources to be surveyed was driven by the decision to concentrate on the requirements of applications such as Machine Translation and Information Systems. By narrowing the focus of application, we hope to have avoided the danger of unmotivated biases.

Finally, we would like to remind the reader that this is an interim report and should be considered as a "polished draft". The final report, available in September 1998, will provide a revised version of the present survey plus an additional part on guidelines for standards in the area of lexical semantic encoding with specific reference to Machine Translation, Information Systems and related component technologies.


EAGLES Central Secretariat