Trattamento Automatico di Varietà Storiche di Italiano

Type of project: Regional  |  Start date: 01/06/2020  |  End date: 31/05/2022

The project aims to make the contents of a cultural institution such as the Accademia della Crusca accessible to a wide and varied audience by offering advanced access and navigation methods equipped with “linguistic intelligence”.

In particular, the project activities are divided into two main macro-areas.

  1. Construction of computational lexicons and corpora enriched with linguistic annotation (primarily morpho-syntactic annotation and lemmatization) for historical varieties of the Italian language
    This macro-area addresses a bottleneck: the lack of adequate resources (lexicons and annotated corpora) for adapting automatic language processing tools to historical varieties of Italian. The resources developed will be used to specialize and/or develop software components for the automatic processing of these varieties, which in turn will make it possible to develop advanced methods of access to cultural content that are also usable by a wide audience.
  2. Extraction and structuring of the knowledge contained in historical digital dictionaries
    The activities of this macro-area will concern the definition of an incremental strategy for structuring the contents of digitized dictionaries and its application to monumental works of Italian lexicography, such as the historical dictionaries of the Italian language made available on the website of the Accademia della Crusca. The content structuring strategy is developed with reference to the “Grande dizionario della lingua italiana” (“Great Dictionary of the Italian Language”) founded by Salvatore Battaglia (UTET).

The two macro-areas present important synergies that will be explored during the project. On the one hand, the knowledge extracted from historical dictionaries can provide valuable input for building the computational lexicons used for text annotation; on the other hand, linguistic annotation tools can support the extraction and structuring of knowledge from the selected historical dictionaries.

The results achieved in the two macro-areas will create the conditions for providing innovative, advanced services that enhance textual collections attesting historical varieties of the Italian language, with an important impact on scholars in the Humanities, on education, and on cultural institutions that work with digital historical texts.



Funding body:
Regione Toscana (POR FSE 2014-2020 - Asse A - Priorità A.2 – Obiettivo A.2.1 – Azione A.2.1.7)

Grant agreement:
CUP B15J19001040004


CNR-ILC role:

Sebastiana Cucurullo
Francesca De Blasi
Manuel Favaro
Elisa Guadagnini
Paolo Picchi
Eva Sassolini
Noemi Terreni