MUSI addresses the problem of multilingual summarisation in order to facilitate access to information retrieval systems and, in particular, to the content of electronic documents on the Internet. As the first phase of a more global service, the project will concentrate on providing summarisation facilities from English and Italian (input languages) into German and French (output languages). As part of the exploitation phase, coverage will be extended to the full set of these languages, plus others.
The project will reuse and integrate existing linguistic tools (analysers, lexicons, grammars, text generators) to develop concept-based summarisation tools that deliver much better quality than existing systems. In particular, MUSI will rely on technologies developed in past European projects (PAROLE, MULINEX, GRAAL, SIMPLE, SPARKLE and GENELEX).
The achievements of the project will be measured through an information retrieval application provided by an end user. This end user will be identified by a consulting business that is a client of the technical co-ordinator.
MUSI directly addresses the market of information retrieval and filtering on the Internet but will also consider other opportunities such as e-commerce (customer information).
The MUSI consortium consists of three organisations:
- ILC (Italy), a well-known institute of Computational Linguistics based in Pisa, which will ensure the financial co-ordination of the project;
- LexiQuest [formerly ERLI] (France), a company specialising in the development and marketing of linguistic solutions, which will be responsible for the technical co-ordination of the project and will ensure the commercial exploitation of the results;
- DFKI (Germany), a non-profit contract research institute working in the field of innovative software development and Language Technology for Information and Knowledge Management (web-based multilingual information systems).
MUSI will provide a multilingual summarisation tool initially covering four languages: English and Italian as input, German and French as output. The tool will be promoted by DFKI and ILC, and LexiQuest will use it for its commercial development.