ISLE

International Standards
for Language Engineering


http://www.ilc.pi.cnr.it/EAGLES96/isle/ISLE_Home_Page.htm

 
 
 
 
 
 
 

Annual Report 2002

 

ISLE is the latest in a series of projects under the successful EAGLES initiative (Expert Advisory Group for Language Engineering Standards). It extends the previously EU-based EAGLES work within the EU-US international research cooperation framework, set up as a result of two years of joint preparatory work towards an international, standards-oriented HLT initiative.

The overall goals of ISLE are to support HLT and national projects, and HLT industry in general, by developing, disseminating and promoting widely agreed and urgently needed HLT de facto standards and guidelines, for infrastructural language resources, for the tools that exploit them, and for HLT products. The areas currently targeted by ISLE are: multilingual computational lexicons, natural interaction and multimodality, and evaluation of HLT systems.

A feature of EAGLES/ISLE work is the close interaction between industry and academia, users and providers, funders and beneficiaries.
 

Summary of 2002 Activities

ISLE has continued to make substantial progress in its three spheres of interest. In multilingual lexicons, specification of the Multilingual ISLE Lexical Entry (MILE) has been completed, a prototype tool to manage MILE-conformant lexicons has been developed, and exemplary lexical resources and sense-tagged corpora have been produced. In evaluation of HLT systems, user feedback obtained through a major international workshop has led to more refined versions of the ISLE evaluation framework and of a framework for classifying machine translation evaluations. In natural interaction and multimodality (NIMM), major surveys have been completed of resources, annotation schemes and tools, and metadata descriptions and tools. Prototype tools have been developed to annotate NIMM data. XML schemas have been developed to handle ISLE metadata descriptions, together with tools for editing, browsing and searching these descriptions, including across distributed resources. Annotation schemes have been devised for gestural and discourse data, and exemplary annotated corpora have been produced and exhaustively checked for accuracy. All three areas have produced guidelines and recommendations for best practice that have evolved through feedback from industry, academia, government agencies and language resource providers. As a whole, the project has been active in dissemination and awareness activities on an international scale.
 

Multilingual Computational Lexicons

In this period, work towards concrete recommendations for bilingual lexical entries was finalised. Work towards developing MILE has been oriented towards the needs of several key HLT applications: MT, cross-language information retrieval, cross-language information extraction, multilingual language generation, multilingual authoring and speech-to-speech translation. It is based on previous EAGLES work on monolingual lexicon standards.

Initially, ISLE carried out a major survey of important bi- and multilingual lexical resources, followed by a study and classification of lexicographic sense indicators, then started work towards recommendations for MILE bilingual dictionary entries. Work first focused on developing a prototype entry for a complex Italian-English word pair. In this period, this work has led to further testing and refinement of the MILE representation. The result is a representation that is now shareable and distributed, and that is less complex than before, yet still rich enough to handle complex cases and complex information while remaining intuitively expressive and learnable for lexicographers. A number of new monolingual (English, French, Italian) and bilingual (English/Italian, Italian/French) entries were written to test the suitability of the revised representation, covering a range of major parts of speech as well as collocations. This has involved intensive exploitation of available resources such as SIMPLE (based on previous EAGLES recommendations), COMLEX, the British National Corpus and SENSEVAL-2 results. In addition, an exemplary corpus was annotated with MILE entries, in collaboration with SENSEVAL-2, and the results were evaluated.

More specifically, further work was carried out to finalise the ISLE lexical recommendations. These relate to several components of the ISLE Lexical Model, which together make up a lexicographic environment for producing and managing ISLE-conformant lexicons:

i)                   The MILE Entry Skeleton, an Entity-Relationship model that defines the general constraints for the construction of multilingual entries, together with the grammar required to build the entire array of lexical elements needed for a given lexical description. It thus defines the MILE Shared Lexical Objects. Objects are identified by URI and may be incorporated in RDF encodings of lexical representations. The use of such objects by lexicon or application developers allows lexical data to be built at a high level of abstraction, which makes the MILE recommendations more straightforward to apply and eases reuse and maintenance.

ii)                 The MILE Shared Lexical Objects consist of three types of object.

1.     The MILE Lexical Classes: These are the main building blocks of lexical entries. They formalise the MILE basic notions, and thus represent notions such as semantic unit, syntactic feature, syntactic frame, semantic predicate, semantic relation, etc.

2.     The MILE Lexical Data Categories: These are instances of the MILE Lexical Classes, uniquely identified by a URI. Core Lexical Data Categories belong to shared repositories (Lexical Data Category Registry); user-defined Lexical Data Categories allow for representation of user-specific or language-specific lexical objects.

3.     The MILE Lexical Operations: Currently, basic operations and conditions allow multilingual links to be established, along with macro-semantic lexical conceptual templates that aid in specifying constraints on the encoding of semantic units. The initial set of basic operations identified allows the specification of multilingual transfer tests and actions, e.g. by enabling the addition of new information in transfer or by constraining the realisation of a target translation.

iii)               The ISLE Lexicographic Station, a prototype tool to manage computational lexicons modelled according to ISLE recommendations. This is in fact a development platform that automatically generates a prototype tool starting from the MILE DTD. This tool is further intended to exemplify the MILE entry, to import SGML data from existing monolingual lexical resources and to allow testing of the MILE guidelines in real scenarios. Work was also finalised on another tool developed to browse sense indicators extracted from machine readable bilingual dictionaries, to support the MILE-based lexicographer interested in establishing sense indicators of bilingual transfer conditions either automatically, through contextual behaviour, or manually, through inference on the basis of browsing the sense indicator database.
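The relationship between Lexical Classes, URI-identified Lexical Data Categories and a shared registry, as described in items i) and ii), can be illustrated with a minimal sketch. This is not the MILE DTD or any ISLE tool: the class names, the `example.org` URIs and the registry methods are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LexicalClass:
    """A basic MILE notion, e.g. a semantic relation or syntactic feature."""
    name: str


@dataclass(frozen=True)
class LexicalDataCategory:
    """An instance of a lexical class, uniquely identified by a URI."""
    uri: str
    lexical_class: LexicalClass
    label: str
    user_defined: bool = False  # False: core category from a shared repository


@dataclass
class DataCategoryRegistry:
    """Hypothetical stand-in for a Lexical Data Category Registry."""
    _by_uri: dict = field(default_factory=dict)

    def register(self, category):
        # URIs must be unique, so a second registration is rejected.
        if category.uri in self._by_uri:
            raise ValueError(f"duplicate URI: {category.uri}")
        self._by_uri[category.uri] = category

    def resolve(self, uri):
        return self._by_uri[uri]

    def instances_of(self, lexical_class):
        return [c for c in self._by_uri.values()
                if c.lexical_class == lexical_class]


# Illustrative data only: these URIs and labels are invented.
SEMANTIC_RELATION = LexicalClass("SemanticRelation")
SYNTACTIC_FEATURE = LexicalClass("SyntacticFeature")

registry = DataCategoryRegistry()
registry.register(LexicalDataCategory(
    "http://example.org/mile/core#hyperonym", SEMANTIC_RELATION, "hyperonym"))
registry.register(LexicalDataCategory(
    "http://example.org/mile/core#gender", SYNTACTIC_FEATURE, "gender"))
registry.register(LexicalDataCategory(
    "http://example.org/mylexicon#clitic-type", SYNTACTIC_FEATURE,
    "clitic type", user_defined=True))

print(registry.resolve("http://example.org/mile/core#hyperonym").label)
print([c.label for c in registry.instances_of(SYNTACTIC_FEATURE)])
```

Because entries refer to categories only by URI, a lexicon built this way could mix core and user-defined categories without copying their definitions, which is the reuse benefit the recommendations aim for.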

In addition, work was carried out to determine how the current MILE responds to the needs of spoken language and multimodal lexicons, giving rise to a draft version of the ISLE Spoken Language Reference Microstructure, and to the need to represent multiword expressions and other distributional phenomena. Extra-contractual collaboration with Asian colleagues during this period made it possible to evaluate the model against the needs of Asian languages and to extend it to cover their morphological needs in particular. The MILE proposal was further refined through evaluation by ENABLER members and was also evaluated for use in industrial applications.

Meetings were held in Barcelona (Europe only) and Las Palmas. A panel was held at LREC 2002, Las Palmas, to discuss short- and medium-term requirements for standards for multilingual lexicons and content management systems.

Evaluation of HLT Systems

The focus of work on evaluation is on methods and metrics for Machine Translation (MT) (earlier EAGLES work had looked at other application areas). This has involved investigation of the various published evaluations of MT systems that have been carried out since 1979. However, this work is not being pursued in isolation: MT is being used as a case study to enable the later development of a general theory about the methodology for evaluating HLT applications, and of a general framework that can accommodate existing evaluation measures for specific HLT applications. A second version of a specific framework for classifying MT evaluations has been elaborated, illustrating how the current state of the ISLE evaluation methodology can be applied. It has been further refined and populated with individual evaluation measures and with associated criteria for the application of each measure. This work builds on ISO/IEC 9126 and ISO/IEC 14598.

The recommendations on evaluation were finalised. A substantial synthetic article based on these has been submitted for publication in Machine Translation. Extra-contractually, a workshop will be held in February 2003 in Los Angeles.

As a result of 2001 workshops, permission was obtained to scan, correct and disseminate an important, but hard to obtain, 1979 report on MT, the Van Slype Report. Also, work was completed on a small corpus of translations that served as input for the LREC workshop mentioned under Awareness.

Natural Interaction and Multimodality

NIMM is concerned with extending previous EAGLES work to cater for the needs of more than textual and spoken language resources, given the growing importance of natural interaction with information systems and the multimodal nature of such interaction, involving multimedia.

In this period, surveys of NIMM data resources, annotation schemes and tools, and metadata aspects of multimodal language resources were completed.

Work was completed on guidelines for the creation of NIMM data resources and for the creation of NIMM annotation schemes. Exemplary annotated corpora have been produced both to test the schemes and to demonstrate how they can be used:

·        Experiments were carried out on annotation studies of gestures, leading to a new version of the ISLE FORM annotation scheme for gesture, the first scheme that both encodes gesture information kinematically and that can be used on pre-existing video data. Various corpora have been produced, amounting to approximately 1.5 hours of FORM annotated video, covering Prep-Stroke-Retraction, motion-capture data and also annotation of natural chimpanzee gestures. Inter-annotator agreement has been good. Annotations are hand-checked for accuracy and the most recent material has been double annotated for accuracy.

·        Annotation of discourse issues relevant to spoken language discourse led to the production of several annotated corpora, in which concepts, dialogue acts and initiatives, and misunderstandings have been annotated.
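Inter-annotator agreement of the kind mentioned for the gesture corpora is often quantified with a chance-corrected statistic. The report does not say which measure was used, so the following is only an illustrative sketch using Cohen's kappa for two annotators; the phase labels and the sample data are invented.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Proportion of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Invented example: two annotators labelling four gesture segments.
annotator_1 = ["prep", "stroke", "stroke", "retraction"]
annotator_2 = ["prep", "stroke", "prep", "retraction"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

A value near 1 indicates agreement well above chance, while a value near 0 indicates agreement no better than chance. Kinematic annotation with continuous values would need a different measure than this categorical one.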

Gesture and discourse annotated corpora will be made available through the LDC in early 2003, due to the slightly later finishing date of the US part of the ISLE project.

Also, work was completed on guidelines for the meta-description of NIMM data resources (IMDI: the ISLE Metadata Initiative) and on an associated demonstration site that shows how the guidelines and associated tools for meta-data description (editor, browser, search tool) can be applied.

Further work on tools saw new releases of the ISLE AGTK component suite. These components allow building of tools for the annotation of linguistic signals, including time-series, for any kind of linguistic behaviour (e.g. audio, video, text) and are based on annotation graphs.
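The annotation graph idea underlying these components can be sketched briefly: nodes may carry time offsets into a signal, and labelled arcs between nodes represent annotations on one or more tiers. The code below is a hedged illustration of that data structure only. It does not use or reproduce the AGTK API, and all class, method and tier names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    id: str
    offset: Optional[float] = None  # seconds into the signal; None if unanchored


@dataclass(frozen=True)
class Arc:
    src: Node
    dst: Node
    tier: str   # kind of annotation, e.g. "word" or "gesture"
    label: str


class AnnotationGraph:
    def __init__(self):
        self.arcs = []

    def annotate(self, src, dst, tier, label):
        arc = Arc(src, dst, tier, label)
        self.arcs.append(arc)
        return arc

    def overlapping(self, start, end, tier=None):
        """Return time-anchored arcs whose span overlaps [start, end)."""
        hits = []
        for arc in self.arcs:
            if arc.src.offset is None or arc.dst.offset is None:
                continue  # skip arcs without both endpoints anchored
            if tier is not None and arc.tier != tier:
                continue
            if arc.src.offset < end and arc.dst.offset > start:
                hits.append(arc)
        return hits


# Invented example: a word tier and a gesture tier over the same signal.
graph = AnnotationGraph()
n0, n1, n2 = Node("n0", 0.0), Node("n1", 0.4), Node("n2", 0.9)
graph.annotate(n0, n1, "word", "hello")
graph.annotate(n1, n2, "word", "there")
g0, g1 = Node("g0", 0.3), Node("g1", 0.7)
graph.annotate(g0, g1, "gesture", "wave")

print([a.label for a in graph.overlapping(0.35, 0.5)])
print([a.label for a in graph.overlapping(0.35, 0.5, tier="word")])
```

Keeping all tiers in one graph over a shared timeline is what lets annotations of different modalities, such as speech and gesture, be queried together.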

User Group, Promotion and Awareness

ISLE, given its standards-oriented profile, is committed to dissemination activities with a view to engendering and enhancing consensus regarding its recommendations and guidelines. This year has seen ISLE also further developing contacts with countries in Asia and the Americas. Steps have already been taken to formalise cooperation with Asian countries. ISLE is also involved in cooperation with ISO TC37/SC4 to consolidate and extend the ISLE results.

Events that project members took part in included:

An ISLE panel on Standards and Best Practice for Multilingual Computational Lexicons was organised for LREC 2002, Las Palmas, involving also non-ISLE panellists.

An ISLE workshop on MILE was held in Pisa, Italy, bringing together ISLE members and also representatives of industry and academia from several countries outside Europe and the USA, notably from Asia. MILE proposals were disseminated in advance and discussed at the workshop. Complementary perspectives and feedback were obtained via invited presentations from neighbouring communities (semantic web, ontologies, industrial multilingual technologies), and plans were laid for the future of standardisation activities in the form of an Open Distributed Lexical Infrastructure. Multilingual lexicons were seen to be not only important in themselves, in relation to the traditional range of HLT applications, but also of critical importance for the successful construction and exploitation of ontologies and semantic web resources and applications, where these depend on language-based data.

MILE work involved cooperation and synergies with the following projects:

·        PAROLE/SIMPLE (enlarged in 9 national projects): main input

·        (Euro)WordNet: provided input

·        XMELLT (NSF): provided input

·        OLIF: received (and provided) input

·        SALT: complementary work

·        ENABLER: validation (and received input)

·        ELSNET: validation

·        SENSEVAL: validation

·        ISLE NIMM WG: for metadata for computational lexicons (also with the US OLAC)

·        INTERA: production and description of language resources

·        SI-TAL and CLIPS: received input

 

An ISLE workshop on machine translation evaluation was organised at LREC 2002, to aid in elaboration of the final recommendations on evaluation. This took the form of practical evaluation exercises in which participants chose two ISLE-inspired metrics, supplied a third of their own choosing, and applied these to supplied data, following the ISLE evaluation framework. Results were then discussed, enabling refinement of the framework. Participants came from industry, academia, government agencies, national and international organisations, and from several countries outside Europe, indicating the interest in ISLE evaluation work.

Work on NIMM involved close cooperation with the projects OLAC, INTERA, ECHO, NITE, CLASS and E-MELD (participation in International E-MELD Workshop, Ann Arbor/Ypsilanti, USA) and with MPEG7, IMS and ISO 82045 initiatives.

In particular, NIMM working group deliverables form an important basis for NITE work, and also cooperation within the Open Language Archives Community (OLAC) project has continued to generate best practice guidelines for resources. ISLE members organised a Workshop on Open Language Archives in December 2002 which led to revision of the three proposed OLAC standards for Metadata, Process and Protocol and to further recommendations for best practice. Discussions were held with potential data providers with a view to creating a European Language Resource Metadata Domain.

An ISLE workshop on IMDI Metadata was held in Nijmegen, as was a training course on IMDI, and an ISLE Workshop on Multimodal Resources and Multimodal Systems Evaluation was organised for LREC 2002.

NIMM working group members participated in the Workshop on Embodied Conversational Agents at the First International Joint Conference on Autonomous Agents & Multi-Agent Systems (AAMAS’02), Bologna, and in a follow-up meeting at the same event to discuss the definition of a standard for language representation for embodied conversational agents. NIMM results were presented to Alcatel Research Laboratories, Stuttgart, in July 2002; to an RNRT project on multimodal interfaces for interactive TV; and to a LIMSI internal project on multilingual corpora.

An ISLE workshop on Dialogue Tagging for Multimodal Human Computer Interaction was held in Edinburgh in December, 2002, to explore how progress could be achieved more rapidly by identifying specific reference tasks to which particular dialogue tags are relevant, rather than concentrating on developing a standard tagset. It became clear that a large discourse tagged corpus for task-oriented dialogues would benefit the community and this is a goal for future activities.


Publications by the project in 2002 include:

A. Computational Lexicons Working Group (CLWG)

Atkins, S., Bel, N., Bertagna, F., Bouillon, P., Calzolari, N., Fellbaum, C., Grishman, R., Lenci, A., MacLeod, C., Palmer, M., Thurmair, G., Villegas, M., Zampolli, A. (2002), From Resources to Applications. Designing the Multilingual ISLE Lexical Entry, in Proceedings of LREC 2002, Las Palmas, Canary Islands, Spain, 2002.

Atkins, S. and Bouillon, P., Relevance in Dictionary-Making, in Proceedings of “I simposi International de Lexicografia”, Publicacions de l'Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra, to appear in 2002.

Calzolari, N., Grishman, R., Palmer, M. (2002), Standards and best practice for multilingual computational lexicon: ISLE MILE…and more, Panel held during the LREC 2002 Conference.

Kingsbury, P., Palmer, M., Marcus, M. (2002), Adding Predicate Argument Structure to the Penn TreeBank, The Second International Human Language Technology Conference, HLT-02, San Diego, CA, March 24-27, 2002.

Kingsbury, P. and Palmer, M. (2002), From TreeBank to PropBank, in Proceedings of LREC 2002, Las Palmas, Canary Islands, Spain, 2002.

Villegas, M. and Bel, N. (2002), From DTDs to relational dBs. An automatic generation of a lexicographical station out off ISLE guidelines, in Proceedings of LREC 2002, Las Palmas, Canary Islands, Spain, 2002.

Calzolari, N., Lenci, A., Bertagna, F., Zampolli, A. (2002), Broadening the Scope of the EAGLES/ISLE Standardization Initiative, in Proceedings of the 3rd Workshop on “Asian Language Resources and International Standardization”, COLING 2002, Taipei (Taiwan), 31st August 2002.

Calzolari, N., Lenci, A., Quochi, V. (2002), Towards Multiword and Multilingual Lexicons: between Theory and Practice.

Dang, H. T. and Palmer, M., Combining Contextual Features for Word Sense Disambiguation, SIGLEX Workshop on “Word Sense Disambiguation”, held in conjunction with the 40th Meeting of the Association for Computational Linguistics, ACL 2002, Philadelphia (PA), 7th-12th July 2002.

Lenci, A., Calzolari, N., Zampolli, A. (2002), From Text to Content: Computational Lexicons and the Semantic Web, in Proceedings of the AAAI 2002 Workshop on “Semantic Web Meets Language Resources”, Edmonton (Canada), 28th July 2002.

Lenci, A. and Ide, N. (2002), The MILE Lexical Model: Linguistic and Formal Architecture, ISLE/EAGLES Workshop “MILE (Multilingual ISLE Lexical Entry) as a Step towards Sharable Multilingual Resources”, Pisa (Italy), 2nd December 2002.

Palmer, M., Dang, H. T., Fellbaum, C., Making Fine-grained and Coarse-grained Sense Distinctions, both Manually and Automatically, Journal of Natural Language Engineering, revisions due in March 2003.

LREC 2002 Workshop Publications

B. Evaluation Working Group (EWG)

Hovy, E., King, M. and Popescu-Belis, A.: Computer-aided Specification of Quality Models for Machine Translation Evaluation. LREC 2002.

El Hadi, W. M., Timimi, I. and Dabbadie, M.: Terminological Enrichment for Non-Interactive MT Evaluation. LREC 2002.

Rajman, M. and Hartley, A.: Automatic Ranking of MT Systems. LREC 2002.

Vanni, M. and Miller, K. J.: Scaling the ISLE Framework: Use of Existing Corpus Resources for Validation of MT Evaluation Metrics across Languages. LREC 2002.

Hovy, E., King, M. and Popescu-Belis, A.: An Introduction to MT Evaluation.

Dabbadie, M., Hartley, A., King, M., Miller, K. J., El Hadi, W. M., Popescu-Belis, A., Reeder, F. and Vanni, M.: A Hands-on Study of the Reliability and Coherence of Evaluation Metrics.

Popescu-Belis, A., King, M. and Benantar, H.: Towards a Corpus of Corrected Human Translations.

C. Natural Interaction and Multimodality Working Group (NIMMWG)

European members:

Bernsen, N. O.: Multimodality in language and speech systems - from theory to design support tool. Chapter to appear in Granström, B. (Ed.): Multimodality in Language and Speech Systems. Dordrecht: Kluwer Academic Publishers 2002 (to appear).

Bevacqua, E., Pelachaud, C.: Coarticulation Model for MPEG-4 Facial Model. Submitted to Computer Animation and Social Agents, May 2003.

Broeder, D. and Hellwig, B.: Metadata Principles and Tools. DOBES Workshop. Nijmegen, May 2002

Broeder, D., Offenga, F. and Willems, D.: Metadata Tools Supporting Controlled Vocabulary Services. Proceedings of the LREC 2002 Conference. Las Palmas, May 2002

Broeder, D., Wittenburg, P. and Declerck, T.: LREP: A Language Repository Exchange Protocol. Proceedings of the LREC 2002 Conference. Las Palmas, May 2002

Brugman, H., Levinson, S., Skiba, R., and Wittenburg, P. The DOBES Archive: Its Purpose and Implementation. Proceedings of the International Workshop on Resources and Tools in Field Linguistics. Las Palmas, May 2002.

Busine, S., Abrilian, S., Rendu, C., Martin, J.-C.: Towards Experimental Specification and Evaluation of Lifelike Multimodal Behavior. Proceedings of the Workshop on "Embodied conversational agents - let's specify and evaluate them!". Marriot, A., Pelachaud, C., Rist, T., Ruttkay, S., Vilhjalmsson, H. (Eds.) pp 42-28. http://www.vhml.org/workshops/AAMAS in conjunction with The First International Joint Conference on Autonomous Agents & Multi-Agent Systems, 16 July, 2002, Bologna, Italy.

De Carolis, B., Carofiglio, V., Bilvi, M. and Pelachaud, C.: APML, a Mark-up Language for Believable Behavior Generation. In workshop “Embodied conversational agents - let's specify and evaluate them!”, associated with First International Joint Conference on Autonomous Agents & Multi-Agent Systems, Bologna, Italy, July 2002.

Dybkjær, L. and Bernsen, N.O.: Natural Interactivity Resources - Data, Annotation Schemes and Tools. Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’2002), Las Palmas, May 2002, 349-356.

Dybkjær, L. and Bernsen, N. O.: Data Resources and Annotation Schemes for Natural Interactivity: Purposes and Needs. Proceedings of the LREC’2002 Workshop on Multimodal Resources and Multimodal Systems Evaluation, Las Palmas, May 2002, 1-7.

Dybkjær, L. and Bernsen, N. O.: Data, Annotation Schemes and Coding Tools for Natural Interactivity. Proceedings of ICSLP’2002, Denver, USA, 2002.

Dybkjær, L. and Bernsen, N. O.: Tagging Communication Problems in Spoken Dialogue Systems: On-line or Off-line? Proceedings of ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K

Guirardello-Damian, R. and Skiba, R.: Trumai Corpus: an Example of Presenting Multi-Media Data in the IMDI-Browser. Proceedings of the International Workshop on Resources and Tools in Field Linguistics. Las Palmas, May 2002.

Hartmann, B., Mancini, M. and Pelachaud, C.: Formational parameters and adaptive prototype instantiation for MPEG-4 compliant gesture synthesis. Computer Animation, Geneva, June 2002.

Martin, J.-C.: Measuring Cooperations Between Modalities In Human Multimodal Behavior. Proceedings of the 4th International Conference on Methods and Techniques in Behavioral Research (MB'2002), Amsterdam, The Netherlands, 27-30 August 2002. (http://www.noldus.com/events/mb2002 http://www.noldus.webaxxs.net/events/mb2002/program/sig_4.html)

Martin, J.-C.: On the Use of the Multimodal Clues in Observed Human Behavior for the Modeling of Agent Cooperative Behavior. Papers from the AAAI Workshop on Autonomy, Delegation, and Control: From Inter-agent to Groups. Eighteenth National Conference on Artificial Intelligence (AAAI-2002). Edmonton, Alberta, Canada, July 28. Technical report WS-02-03. AAAI Press. ISBN 1-577735-156-8. (http://csce.uark.edu/~hexmoor/AAAI-02/AAAI-02-cfp.htm  p 39-43.)

Martin, J.-C. and Kipp, M. (2002) Annotating and Measuring Multimodal Behaviour - Tycoon Metrics in the Anvil Tool. Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC'2002), Las Palmas, Canary Islands, Spain, 29-31 May 2002. http://www.lrec-conf.org/lrec2002/index.html

Martin, J.-C., Réty, J.H., Bensimon, N. (2002). Multimodal and Adaptative Pedagogical Resources. Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC'2002), Las Palmas, Canary Islands, Spain, 29-31 May 2002. http://www.lrec-conf.org/lrec2002/index.html

Pelachaud, C.: Visual Text-to-Speech. In MPEG4 Facial Animation - The standard, implementations and applications, Igor S. Pandzic, Robert Forchheimer (eds.), John Wiley & Sons, 2002.

Pelachaud, C., Carofiglio, V., De Carolis, B., and de Rosis, F.: Embodied Contextual Agent in Information Delivering Application. First International Joint Conference on Autonomous Agents & Multi-Agent Systems, Bologna, July 2002.

Pelachaud, C. and Poggi, I.: Multimodal Embodied Agents. The Knowledge Engineering review, to appear 2002.

Pelachaud, C. and Poggi, I.: Subtleties of Facial Expressions in Embodied Agents. Journal of Visualization and Computer Animation, to appear 2002.

Poggi, I.: Gesture, Gaze and Touch: Literal and Indirect Meaning, virtual symposium on the multimodality of human communication: Theories, problems and applications. http://www.semioticon.com/virtuals/virtual_index.html, 2002.

Rist, T., Schmitt, M., Pelachaud, C., and Bilvi, M.: Towards a Simulation of Conversations with Affective Embodied Speakers and Listeners. Submitted to Computer Animation and Social Agents, May 2003.

de Rosis, F., Pelachaud, C., Poggi, I., Carofiglio, V., and De Carolis, N.: Modeling the Dynamics of Affective States in a Conversational Embodied Agent. Special Issue on “Applications of Affective Computing in Human-Computer Interaction”, The International Journal of Human-Computer Studies, to appear 2002.

Skiba, R. and Brom, N.: Corpus Integration. DOBES Workshop. Nijmegen, May 2002

Skiba, R., Brugman, H., Broeder, D. and Wittenburg, P.: Corpus Organization and Access in Field Linguistics at the MPI. Proceedings of the LREC 2002 Conference. Las Palmas, May 2002

Wittenburg, P.: Corpus construction using Metadata Descriptions. University Utrecht, January 2002.

Wittenburg, P.: Metadata - Future Perspectives and ISO Tasks. ISO TC37/SC4 Foundation Meeting. Las Palmas, May 2002

Wittenburg, P. and Broeder, D.: Management of Language Resources with Metadata. Workshop on International Standards of Terminology and Language Resources Management. Las Palmas, May 2002

Wittenburg, P. and Broeder, D.: Metadata Overview and the Semantic Web. Proceedings of the International Workshop on Resources and Tools in Field Linguistics. Las Palmas, May 2002.

Wittenburg, P., Broeder, D., Offenga, F. and Willems, D.: Metadata Set and Tools for Multimedia/Multimodal Language Resources. Workshop on Multimodal Resources and Multimodal Systems Evaluation. Las Palmas, May 2002

Wittenburg, P., Mosel, U. and Dwyer, A.: Methods of Language Documentation in the DOBES Program. Proceedings of the LREC 2002 Conference. Las Palmas, May 2002

Wittenburg, P., Peters, W. and Broeder, D.: Metadata Proposals for Corpora and Lexica. Proceedings of the LREC 2002 Conference. Las Palmas, May 2002

US members:

Aberdeen, John, (2002). “Semantic Annotation for Misunderstanding Detection in DARPA Communicator Dialogues.”  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Bird, Steven, (2002). "Getting involved in the Open Language Archives Community". Third International Conference on Language Resources and Evaluation (LREC),  Las Palmas, Canary Islands, Spain.

Bird, Steven, Kazuaki Maeda, Xiaoyi Ma, Haejoong Lee, Beth Randall and Salim Zayat, (2002). "TableTrans, MultiTrans, InterTrans and TreeTrans: Diverse Tools Built on the Annotation Graph Toolkit". Third International Conference on Language Resources and Evaluation (LREC), Las Palmas, Canary Islands, Spain.

Bird, Steven and Gary Simons (2002). “Seven Dimensions of Portability for Language Documentation and Description”. Workshop on Portability Issues in Human Language Technologies, Third International Conference on Language Resources and Evaluation, Paris, France, European Language Resources Association.

Cieri, Christopher and Mark Liberman, (2002). "TIDES Language Resources: A Data Map for Translingual Information Access".  Third International Conference on Language Resources and Evaluation (LREC), May 29 - 30, Las Palmas, Canary Islands, Spain.

Cieri, Christopher and Stephanie Strassel, (2002). "The DASL Project: a Case Study in Data Re-Annotation and Re-Use". Third International Conference on Language Resources and Evaluation (LREC), May 29 - 30, Las Palmas, Canary Islands, Spain.

Core, Mark, (2002). “Predicting Success of a Tutorial Dialogue”. ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Cotton, Scott  and Steven Bird, (2002). "An Integrated Framework for Treebanks and Multilayer Annotations".  Third International Conference on Language Resources and Evaluation (LREC), Las Palmas, Canary Islands, Spain.

Devillers, Laurence, (2002). “Annotation and detection of emotion in a task-oriented Human-Human dialog corpus.” ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Di Eugenio, Barbara, (2002). “Tagging tutoring dialogues to generate feedback in intelligent tutoring systems.”  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Doran, Christy, (2002). “Dialogue Act Annotation of DARPA Communicator Dialogues.”  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Dybkjær, Laila, (2002). “Tagging communication problems in spoken dialogue systems: on-line or off-line.” ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Hastie, Helen, (2002). “Performing Automatic Evaluation of Spoken Language Systems without Accessing System Logfiles.”  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Hastie, Helen, Rashmi Prasad and Marilyn Walker (2002). “What’s the Trouble: Automatically Identifying Problematic Dialogs in DARPA Communicator Dialog Systems”. 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA.

Hastie, Helen, Marilyn Walker and Rashmi Prasad, (2002). “Using a DATE Dialogue Act Tagger for User Satisfaction and Task Completion Prediction”. Language Resources and Evaluation Conference (LREC), 2002.

Jordan, Pam, (2002). “A tag-set for the generation of nominals in collaborative dialogue.”  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Loper, Edward and Steven Bird (2002). “NLTK: The Natural Language Toolkit”. ACL Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics, 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA.

Ma, Xiaoyi, Haejoong Lee, Steven Bird and Kazuaki Maeda, (2002). "Models and Tools for Collaborative Annotation". Third International Conference on Language Resources and Evaluation (LREC), May 29 - 30, Las Palmas, Canary Islands, Spain.

Maeda, Kazuaki, Steven Bird, Xiaoyi Ma and Haejoong Lee, (2002). "Creating Annotation Tools with the Annotation Graph Toolkit". Conference on Language Resources and Evaluation (LREC), May 29 - 30, Las Palmas, Canary Islands, Spain.

Martell, Craig, (2002). "The FORM Gesture Annotation System". Third International Conference on Language Resources and Evaluation (LREC): Multimodal Resources and Multimodal Systems Evaluation Workshop, June 1, Las Palmas, Canary Islands, Spain.

Martell, Craig, (2002). "FORM: An Extensible, Kinematically-based Gesture Annotation Scheme". Third International Conference on Language Resources and Evaluation (LREC), May 29 - 30, Las Palmas, Canary Islands, Spain.

Martell, Craig (2002). "A Scheme to Capture the Kinematics of Gesture-poster". 7th International Conference on Spoken Language Processing (ICSLP-2002), Denver, Colorado.

Martell, Craig, (2002). "Using FORM for NIMM (poster)". International CLASS Workshop on Natural, Intelligent, and Effective Interaction in Multimodal Dialogue Systems, June 28-29, 2002, Copenhagen, Denmark.

Maxwell, Mike, (2002). "Resources for Morphology Learning and Evaluation: A Morphological Glossing Assistant". Conference on Language Resources and Evaluation (LREC), May 29 - 30, Las Palmas, Canary Islands, Spain.

Narayanan, Shrikanth, (2002). “Towards modeling user behavior in human-machine interactions: Effect of Errors, Uncertainty and Emotions.”  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Piwek, Paul, (2002). “A Rich Representation Language for the Description of Agent Behaviour in NECA.”  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Prasad, Rashmi and Marilyn Walker (2002). “Training a Dialogue Act Tagger for Human-Human and Human-Computer Travel Dialogues”. 3rd SIGdial Workshop on Discourse and Dialog, 2002, Philadelphia, PA, USA.

Rambow, Owen, (2002). “A Dependency Treebank for English”.  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Rambow, Owen, Cassandre Creswell, Rachel Szekely, Harriet Taber and Marilyn Walker (2002). “A Dependency Treebank for English”. Third International Conference on Language Resources and Evaluation.

Rosset, Sophie, (2002). “Representing Dialog Progression for Dynamic State Assessment”. ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Rudnicky, Alex, (2002). “Annotation for Stochastic Generation.”  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Scott, Donia, (2002). “Pragmatic Congruence through Language Specific Mappings from Semantics to Syntax.”  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Soria, Claudia, (2002). “Dialogue tagging for general purposes: the Adam and Avip projects.”  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Strassel, Stephanie and Christopher Cieri, (2002). "Resource Development for Topic Detection and Tracking Research: The TDT-4 Corpus". Third International Conference on Language Resources and Evaluation (LREC), May 29 - 30, Las Palmas, Canary Islands, Spain.

Walker, Marilyn, (2002). “Comparing Communicator Dialogue Systems using DATE”.  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Walker, Marilyn and Rebecca Passonneau, (2002). “DATE: A Dialogue Act Tagging Scheme for Evaluation of Spoken Dialogue Systems.” Human Language Technology Conference, March 2001.

Webb, Nick, (2002). “Multi-layer Dialogue Annotation for Automated Multilingual Customer Service.”  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Whittaker, Steven, (2002). “Discourse Tagging in Computer Mediated Communication: Identifying and Addressing Problems with Videoconferencing”.  ISLE Workshop on Dialogue Tagging for Multi-modal Human Computer Interaction, organized by Marilyn Walker and Helen Hastie, December 15-17, 2002, University of Edinburgh, Edinburgh, Scotland, U.K.

Whittaker, Steven, Marilyn Walker and Johanna Moore (2002). “Fish or Fowl: A Wizard of Oz Evaluation of Dialogue Strategies in the Restaurant Domain”. Language Resources and Evaluation Conference. Las Palmas.

Future Work

Further contact and dissemination activities are planned to ensure the widest possible feedback on the guidelines and to increase the level of cooperation with other countries. All three areas of ISLE will remain active in pursuing the goals identified during the project. These goals have already led ISLE members to engage widely with resource collection and annotation projects, and with standardisation and best-practice initiatives, on a world-wide basis.
 

Further Information

The EAGLES Secretariat welcomes feedback and enquiries regarding the work of ISLE.

EAGLES Secretariat
CNR - Consiglio Nazionale delle Ricerche
ILC - Istituto di Linguistica Computazionale
Area della Ricerca di Pisa San Cataldo
Via Moruzzi N° 1
56124 Pisa
ITALY
Phone: [+39] 050 315 2873
Fax 1: [+39] 050 315 2834
Fax 2: [+39] 050 315 2839

ISLE reports are placed on the ISLE web server as they receive approval for dissemination (i.e. as they are considered to represent a consensus view). Further details of the project together with links to earlier EAGLES work may also be found at this location.