ISLE is the latest in a series of projects under the successful EAGLES initiative (Expert Advisory Group for Language Engineering Standards). It extends hitherto EU-based EAGLES work within the EU-US international research cooperation framework, set up as a result of two years of joint preparatory work towards an international HLT standards-oriented initiative.
The overall goals of ISLE are to support HLT and national projects, and HLT industry in general, by developing, disseminating and promoting widely agreed and urgently needed HLT de facto standards and guidelines for infrastructural language resources, for the tools that exploit them, and for HLT products. The areas currently targeted by ISLE are: multilingual computational lexicons, natural interaction and multimodality, and evaluation of HLT systems.
A feature of EAGLES/ISLE work is the close interaction between industry
and academia, users and providers, funders and beneficiaries.
Work on evaluation of HLT systems in ISLE focusses on machine translation (MT). The major result this year is the creation of a web site that classifies different aspects of MT evaluations, which can be used by those wishing to use an MT system, to evaluate MT systems, or to design or upgrade an MT system.
Work on natural interaction and multimodality led to a significant workshop at LREC 2000, where an international audience helped advance towards consensus on metadata representation and annotation for multimodal/multimedia language resources.
An important development took place at the EU-US CLWG meeting held at
the University of Pennsylvania in December 2000: two representatives from
Japan (Kyoto University and TIT) and one from Taiwan (Academia Sinica)
were invited to participate. The UPenn contingent also included some native
Korean researchers who presented work on Korean/English bilingual dictionaries.
A number of important bi- and multilingual lexical resources were identified. Each resource has been investigated to determine its lexical structure and how it encodes cross-language relationships. This has resulted in a major survey report, documenting results in a format that allows easy comparison of the lexical mechanisms employed by each lexicon considered. The emphasis is on the semantic level of description, as befits a multilingual perspective; however, other levels of lexical information are also taken into account where these have a bearing on interpretation and on cross-language mapping. This work was undertaken to enable the subsequent identification of a set of basic notions needed to describe the multilingual level, together with other notions that may be recommended for particular purposes or languages.
To facilitate the development of MILE (the Multilingual ISLE Lexical Entry), a joint EU-US meeting agreed that the lexical model of PAROLE/SIMPLE, based on previous EAGLES recommendations, would be taken as a starting point and modified as necessary in the light of the survey results.
A tool to manage computational lexicons modelled according to ISLE recommendations
is being developed.
This has involved the development of not one, but three parallel taxonomies that describe relevant aspects of the nature and use of MT: characteristics of machine translation purpose, characteristics of the machine translation process, and general software characteristics. In addition, individual evaluation measures have been identified and classified into appropriate groupings, and criteria have been developed for the application of each measure.
The results of this work are embodied in a web site that is intended to help three types of user: people who want to use an MT system; those who want to evaluate various MT systems; and those who want to design a new MT system or upgrade an existing one.
A workshop
on MT Evaluation was held in conjunction with LREC 2000, and another
on Hands-On Evaluation at AMTA 2000.
In order to further this work, an international workshop was held at LREC 2000, the major biennial event on language resources.
A workshop on
Web-Based Language Documentation and Description was held in Philadelphia
on December 12th-15th 2000. It was the initiative of two members of the
US ISLE group and was jointly sponsored by IRCS (the Institute for Research
in Cognitive Science of the University of Pennsylvania), ISLE and the NSF
TalkBank project. It was attended by over 100 international participants
from a wide range of industrial, academic and governmental areas. Links
have been established with Dublin Core, MPEG-7 and W3C.
LREC 2000 (numerous papers by EAGLES/ISLE participants)
Workshop on the Evaluation of Machine Translation
Workshop on Web-Based Language Documentation and Description (sponsored by ISLE)
First EAGLES/ISLE Workshop on Meta-Descriptions and Annotation Schemes for Multimodal/Multimedia Language Resources and Data Architectures and Software Support for Large Corpora
Workshop on Hands-On Evaluation of Machine Translation
Workshop on Web-Based Language Documentation and Description (sponsored by IRCS, ISLE and TalkBank)
Regarding evaluation, a workshop on MT evaluation will be held in Geneva, 19th-24th April 2001. Draft guidelines on evaluation will also be available towards the end of 2001.
The EU-US CLWG meeting held at the University of Pennsylvania in December
2000 led to a commitment by US ISLE to fund the participation of several Asian
representatives at future meetings, as a means of preparing the way for
substantial Asian involvement in the expanding ISLE initiative. Draft
results of the CLWG survey were sent to an Asian Federation meeting to
elicit further feedback and discussion on MILE. A main point of interest
in the opening towards Asia is the Japanese project on Multimedia Annotation
(MMA): exchanges of data and information with this project are planned
to take place at various meetings, concerning both the CLWG and the NIMMWG.
EAGLES Secretariat
CNR - Consiglio Nazionale delle Ricerche
ILC - Istituto di Linguistica Computazionale
Area della Ricerca di Pisa San Cataldo
Via Moruzzi N° 1
56124 Pisa
ITALY
Phone: [+39] 050 315 2873
Fax 1: [+39] 050 315 2834
Fax 2: [+39] 050 315 2839
ISLE reports are placed on the ISLE
web server as they receive approval for dissemination (i.e. as they
are considered to represent a consensus view). Further details of the project
together with links to earlier EAGLES work may also be found at this location.