Type of project: European | Start date: 01/03/2025 | End date: 29/02/2028
The LLMs4EU project, coordinated by the Alliance for Language Technologies (ALT-EDIC), aims to preserve European linguistic and cultural diversity in the digital age through cooperation between economic and academic actors. Some European languages risk being left out of generative AI development because of a lack of resources for training language models.
The project brings together Europe’s leading players in the field of generative AI to ensure that European companies, especially SMEs, have access to the tools and resources they need to become competitive in language technologies, particularly Large Language Models (LLMs). The goal is to make LLMs, and all the tools necessary for using them in all EU languages, available as open data by capitalizing on existing European programs and competencies. The tools made accessible to European companies will cover every step from training LLMs to ensuring their compliance with European legislation (AI Act, GDPR, etc.).
The consortium created around ALT-EDIC includes organizations working in more than 20 countries, which ensures good geographical and linguistic coverage. The project will develop relevant use cases to demonstrate the capacity of European actors to work together to create tools adapted to different economic sectors. Coverage of all EU languages will be ensured through the creation and acquisition of the necessary datasets.
The role of the Cnr-Istituto di Linguistica Computazionale “Antonio Zampolli” (CNR-ILC) in LLMs4EU encompasses contributions to the science use case, specifically in data and model documentation and in evaluation. This includes work on data collection and infrastructure, defining requirements for language technology tools, developing efficient fine-tuning and adaptation techniques for models, and establishing robust LLM evaluation methodologies, particularly for human evaluation. Through its involvement with the ALT-EDIC initiative, CNR-ILC also helps ensure transparent, traceable data practices throughout the LLM lifecycle, including data governance and metadata standards.

Acronym:
LLMs4EU
Funding programme:
Digital Europe Programme (DIGITAL)
Funding body:
European Union
Status:
Ongoing
CNR-ILC role:
Beneficiary
Project coordinator:
Alliance pour les technologies des langues (ALT-EDIC)
CNR-ILC Research Unit Chair:
Francesca Frontini
Staff:
Simonetta Montemagni
Anas Fahad Khan
Riccardo Del Gratta
Valeria Quochi
Felice Dell’Orletta
Giulia Venturi
Dominique Pierina Brunato
Alessandro Enea
Paola Baroni
Noemi Terreni
Sara Goggi
Website/s:
https://www.alt-edic.eu/projects/llms4eu/
https://ec.europa.eu/info/funding-tenders/opportunities/portal/screen/opportunities/projects-details/43152860/101198470
