http://www.ilc.pi.cnr.it/EAGLES96/isle/ISLE_Home_Page.htm
ISLE is
the latest in a series of projects under the successful EAGLES initiative (Expert
Advisory Group for Language Engineering Standards). It extends hitherto
EU-based EAGLES work within the EU-US international research cooperation
framework, set up as a result of two years of joint preparatory work towards an
international HLT standards-oriented initiative.
The
overall goals of ISLE are to support HLT and national projects, and HLT
industry in general, by developing, disseminating and promoting widely agreed
and urgently needed HLT de facto standards and guidelines, for infrastructural
language resources, for the tools that exploit them, and for HLT products. The
areas currently targeted by ISLE are: multilingual computational lexicons,
natural interaction and multimodality, and evaluation of HLT systems.
A
feature of EAGLES/ISLE work is the close interaction between industry and
academia, users and providers, funders and beneficiaries.
ISLE has
continued to make substantial progress in its three spheres of interest. In
multilingual lexicons, specification of the Multilingual Isle Lexical Entry
(MILE) has been completed, a prototype tool to manage MILE-conformant lexicons
has been developed and exemplary lexical resources and sense-tagged corpora
have been produced. In evaluation of HLT systems, user feedback obtained
through a major international workshop has led to more refined versions of the
ISLE evaluation framework and of a framework for classifying machine
translation evaluations. In natural interaction and multimodality (NIMM), major
surveys have been completed of resources, annotation schemes and tools, and
metadata descriptions and tools. Prototype tools have been developed to
annotate NIMM data. XML schemas have been developed to handle ISLE metadata
descriptions, and tools to allow editing, browsing and search of these
descriptions, including across distributed resources. Annotation schemes have
been devised for gestural and discourse data, and exemplary annotated corpora
have been produced and exhaustively checked for accuracy. All three areas have
produced guidelines and recommendations for best practice that have
evolved through feedback from industry, academia, government agencies and
language resource providers. As a whole, the project has been active in
dissemination and awareness activities on an international scale.
In this
period, work towards concrete recommendations for bilingual lexical entries was
finalised. Work towards developing MILE has been oriented towards the needs of
several key HLT applications: MT, cross-language information retrieval,
cross-language information extraction, multilingual language generation,
multilingual authoring and speech-to-speech translation. It is based on
previous EAGLES work on monolingual lexicon standards.
Initially,
ISLE carried out a major survey of important bi- and
multilingual lexical resources, followed by a study and classification of
lexicographic sense indicators, then started work towards recommendations for MILE bilingual dictionary entries. Work was
first focused on development of a prototype entry for a complex
Italian-English word pair. This work has led, in this period, to further
testing and refinement of the representation for MILE, and has resulted in a
representation that is now shareable and distributed, and that is less complex
than previously, but still rich enough to handle complex cases and complex
information, while remaining intuitively expressive and learnable for
lexicographers. A number of new monolingual (English, French, Italian) and
bilingual entries (English/Italian, Italian/French) were written to test the
suitability of the revised representation, covering a range of major parts of
speech and also collocations. This has involved intensive exploitation of
available resources such as SIMPLE (based on previous EAGLES
recommendations), COMLEX, the British
National Corpus and SENSEVAL-2 results. In addition, an exemplary corpus was
annotated with MILE entries, in collaboration with SENSEVAL-2, and the results
evaluated.
More
specifically, further work was carried out to finalise the ISLE lexical
recommendations, which relate to several components of the ISLE Lexical Model,
that together make up a lexicographic environment to produce and manage
ISLE-conformant lexicons:
i)
The MILE Entry Skeleton, an Entity-Relationship model to define the
general constraints for the construction of multilingual entries, together with
the grammar required to build the entire array of lexical elements needed for a
given lexical description. It thus defines the MILE Shared Lexical Objects.
Objects are identified by URI and may be incorporated in RDF encodings of
lexical representations. The use of such objects by lexicon or application
developers allows lexical data to be built at a high level of abstraction,
which makes the MILE recommendations simpler to apply and eases reuse and
maintenance.
ii)
The MILE Shared Lexical Objects consist of three types of object.
1.
The MILE Lexical Classes: These are the main building blocks of lexical
entries. They formalise the MILE basic notions, and thus represent notions such
as semantic unit, syntactic feature, syntactic frame, semantic predicate,
semantic relation, etc.
2.
The MILE Lexical Data Categories: These are instances of the MILE
Lexical Classes, uniquely identified by a URI. Core Lexical Data Categories
belong to shared repositories (Lexical Data Category Registry); user-defined
Lexical Data Categories allow for representation of user-specific or
language-specific lexical objects.
3.
The MILE Lexical Operations: Currently, basic operations and conditions
allow the establishment of multilingual
links and macro-semantic lexical conceptual templates that aid in specifying
constraints on the encoding of semantic units. The initial set of basic
operations identified allows the specification of multilingual transfer tests
and actions, e.g. by enabling the addition of new information in transfer or by
constraining the realisation of a target translation.
iii)
The ISLE Lexicographic Station, a prototype tool to manage computational
lexicons modelled according to ISLE recommendations. This is in fact a
development platform that automatically generates a prototype tool starting
from the MILE DTD. This tool is further intended to exemplify the MILE entry,
to import SGML data from existing monolingual lexical resources and to allow
testing of the MILE guidelines in real scenarios. Work was also finalised on
another tool developed to browse
sense indicators extracted from machine readable bilingual dictionaries, to
support the MILE-based lexicographer interested in establishing sense
indicators of bilingual transfer conditions either automatically, through
contextual behaviour, or manually, through inference on the basis of browsing
the sense indicator database.
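As a rough illustration of the object model described in items i) and ii), the sketch below models Lexical Classes, Lexical Data Categories identified by URI, a shared registry, and a lexical entry that refers to categories by URI. It is only a sketch under stated assumptions: every class name, field and example URI here is hypothetical and is not taken from the MILE specification or the ISLE Lexicographic Station.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LexicalClass:
    """A building block of lexical entries, e.g. a syntactic feature (hypothetical)."""
    name: str


@dataclass(frozen=True)
class DataCategory:
    """An instance of a LexicalClass, uniquely identified by a URI."""
    uri: str
    lexical_class: LexicalClass
    value: str
    core: bool = True  # False marks a user-defined category


@dataclass
class Registry:
    """A shared repository of data categories, keyed by URI (hypothetical)."""
    categories: dict = field(default_factory=dict)

    def register(self, category: DataCategory) -> None:
        # URIs must stay unique so that entries can refer to them safely.
        if category.uri in self.categories:
            raise ValueError(f"duplicate URI: {category.uri}")
        self.categories[category.uri] = category

    def resolve(self, uri: str) -> DataCategory:
        return self.categories[uri]


@dataclass
class LexicalEntry:
    """An entry that refers to shared objects by URI instead of copying them."""
    lemma: str
    language: str
    category_uris: list = field(default_factory=list)

    def describe(self, registry: Registry) -> list:
        resolved = [registry.resolve(uri) for uri in self.category_uris]
        return [(c.lexical_class.name, c.value) for c in resolved]


# Example URIs are placeholders, not real ISLE identifiers.
SYNTACTIC_FEATURE = LexicalClass("SyntacticFeature")
NOUN_URI = "http://example.org/mile/pos#noun"
FEMININE_URI = "http://example.org/user/gender#feminine"

registry = Registry()
registry.register(DataCategory(NOUN_URI, SYNTACTIC_FEATURE, "noun"))
registry.register(DataCategory(FEMININE_URI, SYNTACTIC_FEATURE, "feminine", core=False))

entry = LexicalEntry("casa", "it", [NOUN_URI, FEMININE_URI])
print(entry.describe(registry))
```

Referring to categories by URI rather than embedding them is what would let several lexicons share one registry while still adding user-defined categories of their own.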
In
addition, work was carried out to determine how the current MILE responds to
the needs of spoken language and multimodal lexicons, giving rise to a draft
version of the ISLE Spoken Language Reference Microstructure, and to the need
to represent multiword expressions and other distributional phenomena.
Collaboration with Asian colleagues (extra contractual) during this period
enabled evaluation of the model in respect of the needs of Asian languages and
extension of it to cover particularly their morphological needs. The MILE
proposal was further refined through evaluation by ENABLER members and was also
evaluated for use in industrial applications.
Meetings
were held in Barcelona (Europe only) and Las Palmas. A panel was held at LREC
2002, Las Palmas, to discuss short- and medium-term requirements for standards
for multilingual lexicons and content management systems.
The
focus of work on evaluation is on methods and metrics for Machine Translation
(MT) (earlier EAGLES work had looked at other application areas). This has
involved investigation of the various published evaluations of MT systems that
have been carried out since 1979. However, this work is not being pursued in
isolation, as MT is being used as a case study, to enable the later development
of: a general theory about the methodology for evaluating HLT applications; and
a general framework that can accommodate existing evaluation measures for
specific HLT applications. A second version of a specific framework
for classifying MT evaluations has been elaborated, illustrating how the
current state of the ISLE evaluation methodology can be applied, and has been
further refined and further populated with individual evaluation measures, and
associated criteria for the application of each measure. This work builds on
ISO/IEC 9126 and ISO/IEC 14598.
The
recommendations on evaluation were finalised. A substantial synthetic article
based on these has been submitted for publication in Machine Translation.
Extra-contractually, a workshop will be held in February 2003 in Los Angeles.
As a
result of 2001 workshops, permission was obtained to scan, correct and
disseminate an important, but hard to obtain, 1979 report on MT, the Van Slype Report.
Also, work was completed on a small corpus of translations that served as
input for the LREC workshop mentioned under Awareness.
NIMM is
concerned with extending previous EAGLES work to cater for the needs of more
than textual and spoken language resources, given the growing importance of
natural interaction with information systems and the multimodal nature of such
interaction, involving multimedia.
In this
period, surveys of NIMM data resources, annotation schemes and tools, and metadata aspects of multimodal language
resources were completed.
Work was
completed on guidelines for the creation of NIMM data resources and for the
creation of NIMM annotation schemes. Exemplary annotated corpora have been
produced both to test the schemes and to demonstrate how they can be used:
· Experiments were carried out on annotation studies of gestures, leading
to a new version of the ISLE FORM annotation scheme for gesture, the first
scheme that both encodes gesture information kinematically and that can be used
on pre-existing video data. Various corpora have been produced, amounting to
approximately 1.5 hours of FORM annotated video, covering
Prep-Stroke-Retraction, motion-capture data and also annotation of natural
chimpanzee gestures. Inter-annotator agreement has been good. Annotations are hand-checked
for accuracy and the most recent material has been double-annotated.
· Annotation of discourse issues relevant to spoken language discourse led
to the production of several annotated corpora, where concepts, dialogue acts
and initiatives, and misunderstandings have been annotated.
Gesture
and discourse annotated corpora will be made available through the LDC in early
2003, due to the slightly later finishing date of the US part of the ISLE
project.
Also,
work was completed on guidelines for the meta-description of NIMM data
resources (IMDI: the ISLE Metadata Initiative) and on an associated demonstration site that shows how the
guidelines and associated tools for meta-data description (editor, browser,
search tool) can be applied.
Further
work on tools saw new releases of the ISLE AGTK component suite. These components
allow building of tools for the annotation of linguistic signals, including
time-series, for any kind of linguistic behaviour (e.g. audio, video, text) and
are based on annotation graphs.
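To make the annotation graph idea concrete, here is a minimal sketch of the underlying data structure: nodes that may be anchored to a time offset in the signal, and labelled arcs between nodes, grouped into tiers. It is an illustration only and does not reproduce the AGTK API; the names `Node`, `Arc`, `AnnotationGraph`, `annotate` and `labels_at` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Node:
    """A point in the signal; offset is in seconds, or None if unanchored."""
    node_id: str
    offset: Optional[float] = None


@dataclass(frozen=True)
class Arc:
    """A labelled span between two nodes, belonging to one annotation tier."""
    start: Node
    end: Node
    tier: str
    label: str


@dataclass
class AnnotationGraph:
    arcs: list = field(default_factory=list)

    def annotate(self, start: Node, end: Node, tier: str, label: str) -> Arc:
        # Reject arcs that would run backwards in time.
        if (start.offset is not None and end.offset is not None
                and start.offset > end.offset):
            raise ValueError("arc must not run backwards in time")
        arc = Arc(start, end, tier, label)
        self.arcs.append(arc)
        return arc

    def labels_at(self, time: float, tier: str) -> list:
        """Labels on the given tier whose anchored span covers `time`."""
        return [
            a.label for a in self.arcs
            if a.tier == tier
            and a.start.offset is not None
            and a.end.offset is not None
            and a.start.offset <= time < a.end.offset
        ]


# Two tiers over the same time line: words from audio, a gesture from video.
n0, n1, n2 = Node("n0", 0.0), Node("n1", 0.4), Node("n2", 1.1)
graph = AnnotationGraph()
graph.annotate(n0, n1, "word", "hello")
graph.annotate(n1, n2, "word", "there")
graph.annotate(n0, n2, "gesture", "wave")

print(graph.labels_at(0.5, "word"))     # ['there']
print(graph.labels_at(0.5, "gesture"))  # ['wave']
```

Because every tier shares the same nodes and time line, annotations of different kinds (audio, video, text) can be queried together without committing to a single tagset.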
ISLE,
given its standards-oriented profile, is committed to dissemination activities
with a view to engendering and enhancing consensus regarding its
recommendations and guidelines. This year has seen ISLE also further developing
contacts with countries in Asia and the Americas. Steps have already been taken
to formalise cooperation with Asian countries. ISLE is also involved in
cooperation with ISO TC37/SC4 to consolidate and extend the ISLE results.
Events
that project members took part in included:
An ISLE
panel on Standards and Best Practice for Multilingual Computational Lexicons
was organised for LREC 2002, Las Palmas, involving also non-ISLE panellists.
An ISLE
workshop on MILE was held in Pisa, Italy, bringing together ISLE members and
also representatives of industry and academia from several countries outside
Europe and the USA, notably from Asia. MILE proposals were disseminated in
advance and discussed at the workshop. Complementary perspectives and feedback
were obtained via invited presentations from neighbouring communities (semantic
web, ontologies, industrial multilingual technologies), and plans laid for the
future of standardisation activities in the form of an Open Distributed Lexical
Infrastructure. Multilingual lexicons were seen to be not only important in
themselves, in relation to the traditional range of HLT applications, but of
critical importance for the successful construction and exploitation of
ontologies and semantic web resources and applications, where these depended on
language-based data.
MILE
work involved cooperation and synergies with the following projects:
· PAROLE/SIMPLE (enlarged in 9 national projects): main input
· (Euro)WordNet: provided input
· XMELLT (NSF): provided input
· OLIF: received (and provided) input
· SALT: complementary work
· ENABLER: validation (and received input)
· ELSNET: validation
· SENSEVAL: validation
· ISLE NIMM WG: for metadata for computational lexicons (also with the US OLAC)
· INTERA: production and description of language resources
· SI-TAL and CLIPS: received input
An ISLE workshop
on machine translation evaluation was organised at LREC 2002, to aid in
elaboration of the final recommendations on evaluation. This took the form of
practical evaluation exercises where participants chose 2 ISLE-inspired metrics
and supplied a 3rd of their own choosing, and applied these to
supplied data, following the ISLE evaluation framework. Results were then
discussed, enabling refinement of the framework. Participants came from industry, academia, government agencies,
national and international organisations, and from several countries outside
Europe, indicating the interest in ISLE evaluation work.
Work on
NIMM involved close cooperation with the projects OLAC, INTERA, ECHO, NITE,
CLASS and E-MELD (participation in International E-MELD Workshop, Ann
Arbor/Ypsilanti, USA) and with MPEG7, IMS and ISO 82045 initiatives.
In
particular, NIMM working group deliverables form an important basis for NITE
work, and also cooperation within the Open Language Archives Community (OLAC)
project has continued to generate best practice guidelines for resources. ISLE
members organised a Workshop
on Open Language Archives in December 2002 which led to revision of the
three proposed OLAC standards for Metadata, Process and Protocol and to further
recommendations for best practice. Discussions were held with potential data
providers with a view to creating a European Language Resource Metadata Domain.
An ISLE
workshop on IMDI Metadata was held in Nijmegen, as was a training course on IMDI, and an ISLE
Workshop on Multimodal Resources and Multimodal Systems Evaluation was
organised for LREC 2002.
NIMM
working group members participated in the Workshop on Embodied Conversational
Agents at the First International Joint Conference on Autonomous Agents & Multi-Agent
Systems (AAMAS’02), Bologna and in a follow-up meeting at the same event to
discuss definition of a standard for language representation for embodied
conversational agents. NIMM results were presented to Alcatel Research
Laboratories, Stuttgart, July 2002; to an RNRT project on multimodal interfaces
for interactive TV and to a LIMSI internal project on multilingual corpora.
An ISLE
workshop on Dialogue
Tagging for Multimodal Human Computer Interaction was held in Edinburgh in
December, 2002, to explore how progress could be achieved more rapidly by
identifying specific reference tasks to which particular dialogue tags are
relevant, rather than concentrating on developing a standard tagset. It became
clear that a large discourse tagged corpus for task-oriented dialogues would
benefit the community and this is a goal for future activities.
Publications by the project in 2002 include:
A.
Computational Lexicons Working Group (CLWG)
Atkins,
S., Bel, N., Bertagna, F., Bouillon, P., Calzolari, N., Fellbaum, C., Grishman,
R., Lenci, A., MacLeod, C., Palmer, M., Thurmair, G., Villegas, M., Zampolli,
A. (2002), From Resources to Applications. Designing the Multilingual ISLE
Lexical Entry, in Proceedings of LREC 2002, Las Palmas, Canary Islands, Spain,
2002.
Atkins,
S. and Bouillon, P., Relevance in Dictionary-Making, in Proceedings of “I
simposi International de Lexicografia”, Publicacions de l'Institut Universitari
de Lingüística Aplicada, Universitat Pompeu Fabra, to appear in 2002.
Calzolari,
N., Grishman, R., Palmer, M. (2002), Standards and best practice for
multilingual computational lexicon: ISLE MILE…and more, Panel held during the
LREC 2002 Conference.
Kingsbury,
P., Palmer, M., Marcus, M. (2002), Adding Predicate Argument Structure to the
Penn TreeBank, The Second International Human Language Technology Conference,
HLT-02, San Diego, CA, March 24-27, 2002.
Kingsbury,
P. and Palmer, M. (2002), From TreeBank to PropBank, in Proceedings of LREC
2002, Las Palmas, Canary Islands, Spain, 2002.
Villegas,
M. and Bel, N. (2002), From DTDs to relational dBs. An automatic generation of
a lexicographical station out off ISLE guidelines, in Proceedings of LREC 2002,
Las Palmas, Canary Islands, Spain, 2002.
Calzolari,
N., Lenci, A., Bertagna, F., Zampolli, A. (2002), Broadening the Scope of the
EAGLES/ISLE Standardization Initiative, in Proceedings of the 3rd Workshop on “Asian
Language Resources and International Standardization”, COLING 2002, Taipei (Taiwan),
31st August 2002.
Calzolari,
N., Lenci, A., Quochi, V. (2002), Towards Multiword and Multilingual Lexicons:
between Theory and Practice.
Dang, H.
T. and Palmer, M., Combining Contextual Features for Word Sense Disambiguation,
SIGLEX Workshop on “Word Sense Disambiguation”, held in conjunction with the 40th
Meeting of the Association for Computational Linguistics, ACL 2002,
Philadelphia (PA), 7th-12th July 2002.
Lenci,
A., Calzolari, N., Zampolli, A. (2002), From Text to Content: Computational
Lexicons and the Semantic Web, in Proceedings of the AAAI 2002 Workshop on
“Semantic Web Meets Language Resources”, Edmonton (Canada), 28th July 2002.
Lenci,
A. and Ide, N. (2002), The MILE Lexical Model: Linguistic and Formal
Architecture, ISLE/EAGLES Workshop “MILE (Multilingual ISLE Lexical Entry) as a
Step towards Sharable Multilingual Resources”, Pisa (Italy), 2nd December 2002.
Palmer,
M., Dang, H. T., Fellbaum, C., Making Fine-grained and Coarse-grained Sense
Distinctions, both Manually and Automatically, Journal of Natural Language
Engineering, revisions due in March 2003.
LREC 2002 Workshop Publications
B. Evaluation
Working Group (EWG)
Hovy,
E., King, M. and Popescu-Belis, A.: Computer-aided Specification of Quality
Models for Machine Translation Evaluation. LREC 2002.
El Hadi,
W. M., Timimi, I. and Dabbadie, M.: Terminological Enrichment for
Non-Interactive MT Evaluation. LREC 2002.
Rajman,
M. and Hartley, A.: Automatic Ranking of MT Systems. LREC 2002.
Vanni,
M. and Miller, K. J.: Scaling the ISLE Framework: Use of Existing Corpus
Resources for Validation of MT Evaluation Metrics across Languages. LREC 2002.
Hovy,
E., King, M. and Popescu-Belis, A.: An Introduction to MT Evaluation.
Dabbadie,
M., Hartley, A., King, M., Miller, K. J., El Hadi, W. M., Popescu-Belis, A.,
Reeder, F. and Vanni, M.: A Hands-on Study of the Reliability and Coherence of
Evaluation Metrics.
Popescu-Belis,
A., King, M. and Benantar, H.: Towards a Corpus of Corrected Human
Translations.
C. Natural
Interaction and Multimodality Working Group (NIMMWG)
European
members:
Bernsen,
N. O.: Multimodality in language and speech systems - from theory to design
support tool. Chapter to appear in Granström, B. (Ed.): Multimodality in Language and Speech Systems. Dordrecht: Kluwer
Academic Publishers 2002 (to appear).
Bevacqua,
E., Pelachaud, C.: Coarticulation Model for MPEG-4 Facial Model. Submitted to
Computer Animation and Social Agents, May 2003.
Broeder,
D. and Hellwig, B.: Metadata Principles and Tools. DOBES Workshop. Nijmegen, May 2002
Broeder,
D. and Offenga, F. and Willems, D.: Metadata Tools Supporting Controlled
Vocabulary Services. Proceedings of
the LREC 2002 Conference. Las Palmas, May 2002
Broeder,
D., Wittenburg, P. and Declerck, T.: LREP: A Language Repository Exchange
Protocol. Proceedings of the LREC
2002 Conference. Las Palmas, May 2002
Brugman,
H., Levinson, S., Skiba, R., and Wittenburg, P. The DOBES Archive: Its Purpose
and Implementation. Proceedings of
the International Workshop on Resources and Tools in Field Linguistics. Las
Palmas, May 2002.
Busine,
S., Abrilian, S., Rendu, C., Martin, J.-C.: Towards Experimental Specification
and Evaluation of Lifelike Multimodal Behavior. Proceedings of the Workshop on
"Embodied conversational agents - let's specify and evaluate them!".
Marriot, A., Pelachaud, C., Rist, T., Ruttkay, S., Vilhjalmsson, H. (Eds.) pp
42-28. http://www.vhml.org/workshops/AAMAS
in conjunction with The First International Joint Conference on Autonomous
Agents & Multi-Agent Systems, 16 July, 2002, Bologna, Italy.
De
Carolis, B., Carofiglio, V., Bilvi, M. and Pelachaud, C.: APML, a Mark-up
Language for Believable Behavior Generation. In workshop “Embodied
conversational agents - let's specify and evaluate them!”, associated with
First International Joint Conference on
Autonomous Agents & Multi-Agent Systems, Bologna, Italy, July 2002.
Dybkjær,
L. and Bernsen, N.O.: Natural Interactivity Resources - Data, Annotation
Schemes and Tools. Proceedings of the Third
International Conference on Language Resources and Evaluation (LREC’2002),
Las Palmas, May 2002, 349-356.
Dybkjær,
L. and Bernsen, N. O.: Data Resources and Annotation Schemes for Natural
Interactivity: Purposes and Needs. Proceedings of the LREC’2002 Workshop on Multimodal Resources and Multimodal Systems
Evaluation, Las Palmas, May 2002, 1-7.
Dybkjær,
L. and Bernsen, N. O.: Data, Annotation Schemes and Coding Tools for Natural
Interactivity. Proceedings of ICSLP’2002,
Denver, USA, 2002.
Dybkjær,
L. and Bernsen, N. O.: Tagging Communication Problems in Spoken Dialogue
Systems: On-line or Off-line? Proceedings of ISLE Workshop on Dialogue Tagging for
Multi-modal Human Computer Interaction, organized
by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of
Edinburgh, Edinburgh, Scotland, U.K
Guirardello-Damian,
R. and Skiba, R.: Trumai Corpus: an
Example of Presenting Multi-Media Data in the IMDI-Browser. Proceedings of the International Workshop on Resources and Tools
in Field Linguistics. Las Palmas, May 2002.
Hartmann,
B., Mancini, M. and Pelachaud, C.: Formational parameters and adaptive
prototype instantiation for MPEG-4 compliant gesture synthesis. Computer
Animation, Geneva, June 2002.
Martin,
J.-C.: Measuring Cooperations Between Modalities In Human Multimodal Behavior.
Proceedings of the 4th International Conference on Methods and Techniques in
Behavioral Research (MB'2002), Amsterdam, The Netherlands, 27-30 August 2002. (http://www.noldus.com/events/mb2002 http://www.noldus.webaxxs.net/events/mb2002/program/sig_4.html)
Martin,
J.-C.: On the Use of the Multimodal Clues in Observed Human Behavior for the
Modeling of Agent Cooperative Behavior. Papers from the AAAI Workshop on
Autonomy, Delegation, and Control: From Inter-agent to Groups. Eighteenth
National Conference on Artificial Intelligence (AAAI-2002). Edmonton, Alberta,
Canada, July 28. Technical report WS-02-03. AAAI Press. ISBN 1-577735-156-8. (http://csce.uark.edu/~hexmoor/AAAI-02/AAAI-02-cfp.htm p 39-43.)
Martin,
J.-C. and Kipp, M. (2002) Annotating and Measuring Multimodal Behaviour -
Tycoon Metrics in the Anvil Tool. Proceedings of the 3rd International
Conference on Language Resources and Evaluation (LREC'2002), Las Palmas, Canary
Islands, Spain, 29-31 May 2002. http://www.lrec-conf.org/lrec2002/index.html
Martin,
J.-C., Réty, J.H., Bensimon, N. (2002). Multimodal and Adaptative Pedagogical
Resources. Proceedings of the 3rd International Conference on Language
Resources and Evaluation (LREC'2002), Las Palmas, Canary Islands, Spain, 29-31
May 2002. http://www.lrec-conf.org/lrec2002/index.html
Pelachaud,
C.: Visual Text-to-Speech. In MPEG4
Facial Animation - The standard, implementations and applications, Igor S.
Pandzic, Robert Forchheimer (eds.), John Wiley & Sons, 2002.
Pelachaud,
C., Carofiglio, V., De Carolis, B., and de Rosis, F.: Embodied Contextual
Agent in Information Delivering Application. First International Joint
Conference on Autonomous Agents &
Multi-Agent Systems, Bologna, July 2002.
Pelachaud,
C. and Poggi, I.: Multimodal Embodied Agents. The Knowledge Engineering review,
to appear 2002.
Pelachaud,
C. and Poggi, I.: Subtleties of Facial Expressions in Embodied Agents. Journal
of Visualization and Computer Animation, to appear 2002.
Poggi,
I.: Gesture, gaze and Touch: Literal and Indirect Meaning, virtual symposium on
the multimodality of human communication: Theories, problems and applications. http://www.semioticon.com/virtuals/virtual_index.html, 2002.
Rist,
T., Schmitt, M., Pelachaud, C., and Bilvi, M.: Towards a Simulation of
Conversations with Affective Embodied Speakers and Listeners. Submitted to
Computer Animation and Social Agents, May 2003.
de
Rosis, F., Pelachaud, C., Poggi, I., Carofiglio, V., and De Carolis, N.:
Modeling the Dynamics of Affective States in a
Conversational Embodied Agent. Special Issue on “Applications of
Affective Computing in Human-Computer Interaction”, The International Journal
of Human-Computer Studies, to appear 2002.
Skiba,
R. and Brom, N.: Corpus Integration. DOBES
Workshop. Nijmegen, May 2002
Skiba,
R., Brugman, H., Broeder, D. and Wittenburg, P.: Corpus Organization and
Access in Field Linguistics at the MPI. Proceedings
of the LREC 2002 Conference. Las Palmas, May 2002
Wittenburg,
P.: Corpus construction using Metadata Descriptions. University Utrecht,
January 2002.
Wittenburg,
P.: Metadata - Future Perspectives and ISO Tasks. ISO TC37/SC4 Foundation Meeting. Las Palmas, May 2002
Wittenburg,
P. and Broeder, D.: Management of Language Resources with Metadata. Workshop on International Standards of Terminology and Language Resources
Management. Las Palmas, May 2002
Wittenburg,
P. and Broeder, D.: Metadata Overview and the Semantic Web. Proceedings of the
International Workshop on Resources and Tools in Field Linguistics. Las Palmas,
May 2002.
Wittenburg,
P., Broeder, D., Offenga, F. and Willems, D.: Metadata Set and Tools for
Multimedia/Multimodal Language Resources.
Workshop on Multimodal Resources
and Multimodal Systems Evaluation. Las Palmas, May 2002
Wittenburg,
P., Mosel, U. and Dwyer, A.: Methods of Language Documentation in the DOBES
Program. Proceedings of the LREC 2002 Conference. Las Palmas, May 2002
Wittenburg,
P., Peters, W. and Broeder, D.: Metadata Proposals for Corpora and Lexica. Proceedings of the LREC 2002
Conference. Las Palmas, May 2002
US members:
Aberdeen, John, (2002). “Semantic Annotation for
Misunderstanding Detection in DARPA Communicator Dialogues.” ISLE
Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction,
organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University
of Edinburgh, Edinburgh, Scotland, U.K.
Bird, Steven, (2002). "Getting involved in the
Open Language Archives Community". Third
International Conference on Language Resources and Evaluation (LREC), Las Palmas, Canary Islands, Spain.
Bird, Steven, Kazuaki
Maeda, Xiaoyi Ma, Haejoong Lee,
Beth Randall and Salim Zayat, (2002).
"TableTrans, MultiTrans, InterTrans and TreeTrans: Diverse Tools Built on
the Annotation Graph Toolkit". Third
International Conference on Language Resources and Evaluation (LREC), Las
Palmas, Canary Islands, Spain.
Bird, Steven and Gary Simmons (2002). “Seven
Dimensions of Portability for Language Documentation and Description”. Workshop on Portability Issues in Human
Language Technologies, Third International Conference on Language Resources and
Evaluation, Paris, France, European Language Resources Association.
Cieri, Christopher and Mark Liberman, (2002).
"TIDES Language Resources: A Data Map for Translingual Information
Access". Third International Conference on Language Resources and Evaluation
(LREC), May 29 - 30, Las Palmas, Canary Islands, Spain.
Cieri, Christopher and Stephanie Strassel, (2002).
"The DASL Project: a Case Study in Data Re-Annotation and Re-Use". Third International Conference on Language
Resources and Evaluation (LREC), May 29 - 30, Las Palmas, Canary Islands,
Spain.
Core, Mark, (2002). “Predicting Success of a Tutorial
Dialogue”. ISLE Workshop on Dialogue Tagging
for Multi-modal Human Computer Interaction, organized by Marilyn Walker and
Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh,
Scotland, U.K.
Cotton, Scott
and Steven Bird, (2002). "An Integrated Framework for Treebanks and
Multilayer Annotations". Third International Conference on Language
Resources and Evaluation (LREC), Las Palmas, Canary Islands, Spain.
Devillers, Laurence, (2002). “Annotation and detection
of emotion in a task-oriented Human-Human dialog corpus.” ISLE
Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction,
organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University
of Edinburgh, Edinburgh, Scotland, U.K.
Di Eugenio, Barbara, (2002). “Tagging tutoring dialogues to generate feedback in intelligent tutoring
systems.” ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer
Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17,
2002, University of Edinburgh, Edinburgh, Scotland, U.K.
Doran, Christy, (2002). “Dialogue Act Annotation of
DARPA Communicator Dialogues.” ISLE Workshop on Dialogue Tagging for
Multi-modal Human Computer Interaction, organized by Marilyn Walker and
Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh,
Scotland, U.K.
Dybkjær, Laila, (2002). “Tagging communication
problems in spoken dialogue systems: online or off-line.” ISLE
Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction,
organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University
of Edinburgh, Edinburgh, Scotland, U.K.
Hastie, Helen, (2002). “Performing Automatic
Evaluation of Spoken Language Systems without Accessing System Logfiles.” ISLE
Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction,
organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University
of Edinburgh, Edinburgh, Scotland, U.K.
Hastie, Helen, Rashmi Prasad and Marilyn Walker
(2002). “What’s the Trouble: Automatically Identifying Problematic Dialogs in
DARPA Communicator Dialog Systems”. 40th
Annual Meeting of the Association for Computational Linguistics,
Philadelphia, PA, USA.
Hastie, Helen, Marilyn Walker and Rashmi Prasad,
(2002). “Using a DATE Dialogue Act Tagger for User Satisfaction and Task
Completion Prediction”. Language Resources and Evaluation Conference, 2002.
Jordan, Pam, (2002). “A tag-set for the generation of
nominals in collaborative dialogue.” ISLE Workshop on Dialogue Tagging for
Multi-modal Human Computer Interaction, organized by Marilyn Walker and
Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh,
Scotland, U.K.
Loper, Edward and Steven Bird (2002). “NLTK: The
Natural Language Toolkit”. ACL Workshop on Effective Tools and Methodologies for Teaching Natural
Language Processing and Computational Linguistics, 40th Annual
Meeting of the Association for Computational Linguistics, Philadelphia, PA,
USA.
Ma, Xiaoyi, Haejoong Lee, Steven Bird and Kazuaki Maeda, (2002).
"Models and Tools for Collaborative Annotation". Third International Conference on Language
Resources and Evaluation (LREC), May 29 - 30, Las Palmas, Canary Islands,
Spain.
Maeda, Kazuaki, Steven Bird, Xiaoyi Ma and Haejoong Lee, (2002). "Creating
Annotation Tools with the Annotation Graph Toolkit". Conference on
Language Resources and Evaluation (LREC), May 29 - 30, Las Palmas, Canary
Islands, Spain.
Martell, Craig, (2002). "The FORM Gesture
Annotation System". Third
International Conference on Language Resources and Evaluation (LREC):
Multimodal Resources and Multimodal
Systems Evaluation Workshop, June 1, Las Palmas, Canary Islands,
Spain.
Martell, Craig, (2002). "FORM: An Extensible,
Kinematically-based Gesture Annotation Scheme". Third International Conference on Language Resources and Evaluation
(LREC), May 29 - 30, Las Palmas, Canary Islands, Spain.
Martell, Craig (2002). "A Scheme to Capture the
Kinematics of Gesture-poster". 7th
International Conference on Spoken Language Processing (ICSLP-2002),
Denver, Colorado.
Martell, Craig, (2002). "Using FORM for NIMM
(poster)". International CLASS
Workshop on Natural, Intelligent, and Effective Interaction in Multimodal
Dialogue Systems, June 28-29, 2002, Copenhagen, Denmark.
Maxwell, Mike, (2002). "Resources for Morphology
Learning and Evaluation: A Morphological Glossing Assistant". Conference on Language Resources and
Evaluation (LREC), May 29 - 30, Las Palmas, Canary Islands, Spain.
Narayanan, Shrikanth, (2002). “Towards modeling user
behavior in human-machine interactions: Effect of Errors, Uncertainty and
Emotions.” ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer
Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17,
2002, University of Edinburgh, Edinburgh, Scotland, U.K.
Piwek, Paul, (2002). “A Rich Representation Language
for the Description of Agent Behaviour in NECA.” ISLE Workshop on Dialogue
Tagging for Multi-modal Human Computer Interaction, organized by Marilyn
Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh,
Edinburgh, Scotland, U.K.
Prasad, Rashmi and Marilyn Walker (2002). “Training a
Dialogue Act Tagger for Human-Human and Human-Computer Travel Dialogues”. 3rd SIGdial Workshop on Discourse
and Dialog, 2002, Philadelphia, PA, USA.
Rambow, Owen, (2002). “A Dependency Treebank for
English”. ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer
Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17,
2002, University of Edinburgh, Edinburgh, Scotland, U.K.
Rambow, Owen, Cassandre Creswell, Rachel Szekely,
Harriet Taber and Marilyn Walker (2002). “A Dependency Treebank for English”. Third International Conference on Language
Resources and Evaluation.
Rosset, Sophie, (2002). “Representing Dialog
Progression for Dynamic State Assessment”.
ISLE Workshop on Dialogue Tagging
for Multi-modal Human Computer Interaction, organized by Marilyn Walker and
Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh,
Scotland, U.K.
Rudnicky, Alex, (2002). “Annotation for Stochastic
Generation.” ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer
Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17,
2002, University of Edinburgh, Edinburgh, Scotland, U.K.
Scott, Donia, (2002). “Pragmatic Congruence through
Language Specific Mappings from Semantics to Syntax.” ISLE Workshop on Dialogue
Tagging for Multi-modal Human Computer Interaction, organized by Marilyn
Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh,
Edinburgh, Scotland, U.K.
Soria, Claudia, (2002). “Dialogue tagging for general
purposes: the Adam and Avip projects.” ISLE Workshop on Dialogue Tagging for
Multi-modal Human Computer Interaction, organized by Marilyn Walker and
Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh,
Scotland, U.K.
Strassel, Stephanie and Christopher Cieri, (2002). "Resource Development for Topic
Detection and Tracking Research: The TDT-4 Corpus". Third International Conference on
Language Resources and Evaluation (LREC), May 29 - 30, Las Palmas,
Canary Islands, Spain.
Walker, Marilyn, (2002). “Comparing Communicator
Dialogue Systems using DATE”. ISLE Workshop on Dialogue Tagging for
Multi-modal Human Computer Interaction, organized by Marilyn Walker and
Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh,
Scotland, U.K.
Walker, Marilyn and Rebecca Passonneau, (2002). “DATE:
A Dialogue Act Tagging Scheme for Evaluation of Spoken Dialogue Systems.” Human
Language Technology Conference, March, 2001.
Webb, Nick, (2002). “Multi-layer Dialogue Annotation
for Automated Multilingual Customer Service.”
ISLE Workshop on Dialogue Tagging
for Multi-modal Human Computer Interaction, organized by Marilyn Walker and
Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh,
Scotland, U.K.
Whittaker, Steven, (2002). “Discourse Tagging in
Computer Mediated Communication: Identifying and Addressing Problems with
Videoconferencing”. ISLE Workshop on Dialogue Tagging for
Multi-modal Human Computer Interaction, organized by Marilyn Walker and
Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh,
Scotland, U.K.
Whittaker, Steven, Marilyn Walker and Johanna Moore
(2002). “Fish or Fowl: A Wizard of Oz Evaluation of Dialogue Strategies in the
Restaurant Domain”. Language Resources
and Evaluation Conference. Las Palmas.
Further
contact and dissemination activities are planned to ensure the widest possible
feedback on the guidelines and to increase the level of cooperation with other
countries. All three areas of ISLE will remain active in pursuing the
goals identified during the project. This work has already led to
wide-reaching engagement of ISLE members with resource collection and
annotation projects, and with standardisation and best-practice initiatives,
on a world-wide basis.
The EAGLES Secretariat welcomes
feedback and enquiries regarding the work of ISLE.
EAGLES Secretariat
CNR - Consiglio Nazionale delle Ricerche
ILC - Istituto di Linguistica Computazionale
Area della Ricerca di Pisa San Cataldo
Via Moruzzi N° 1
56124 Pisa
ITALY
Phone: [+39] 050 315 2873
Fax 1: [+39] 050 315 2834
Fax 2: [+39] 050 315 2839
ISLE
reports are placed on the ISLE web server as they receive approval
for dissemination (i.e. as they are considered to represent a consensus view).
Further details of the project together with links to earlier EAGLES work may
also be found at this location.