COMPUTATIONAL LINGUISTICS

The goal of Computational Linguistics (CL) is to develop models of how human language works that can be translated into executable computer programs, enabling a computer to understand and communicate in any known language, written or spoken, whether in daily use or handed down through manuscripts, inscriptions and other indirect evidence.

In general, we speak of Automatic Language Processing – ALP (or Natural Language Processing – NLP) to refer to the set of data, algorithms and technologies aimed at this goal.

Main applications

NLP tools have made it possible to develop software platforms that process human language to perform specific linguistic tasks, such as machine translation and text or speech comprehension.

In recent years, these technologies and application tools have made great progress in terms of accuracy and ease of use and have been integrated into widely used and commercialised systems, with great social and economic impact, for example:

  • virtual assistants;
  • search engines;
  • machine translation systems.

The link between this field of Computational Linguistics (CL) and recent technological developments related to the spread of Artificial Intelligence (AI) is evident, especially regarding the optimization of communicative interaction between a human user and an automated service or device.

The study of texts has traditionally made use of automatic text and language processing tools. Since the beginnings of CL as an autonomous scientific discipline in the first half of the 1950s, the automatic production of concordances (the set of local contexts in which the words of a text or a set of texts recur) has been a fundamentally important tool for textual criticism. Similarly, automatic word-frequency counts have made it possible to analyse the distribution of words in large textual corpora, revealing:

  • the quantitative traces of a style;
  • the prevailing content of a document through its lexical footprint;
  • the characteristic frequency distribution of linguistic features of a specific literary genre.
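As a rough illustration of the two basic tools described above, the following Python sketch builds a keyword-in-context concordance and a word-frequency count. The function names (tokenize, concordance, word_frequencies), the naive regular-expression tokenizer and the sample sentence are assumptions made for this example and do not come from any specific CL toolkit; real corpus work would need language-aware tokenization and normalisation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive rule)."""
    return re.findall(r"\w+", text.lower())


def concordance(tokens, keyword, window=2):
    """Return every occurrence of `keyword` with up to `window` tokens of context per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


def word_frequencies(tokens):
    """Count how many times each token occurs."""
    return Counter(tokens)


# Hypothetical sample text, used only to exercise the functions.
sample = "The cat sat on the mat. The dog saw the cat."
tokens = tokenize(sample)

# Print one concordance line per occurrence of the keyword.
for left, key, right in concordance(tokens, "cat", window=2):
    print(f"{left:>10} [{key}] {right}")

# Print the three most frequent tokens with their counts.
print(word_frequencies(tokens).most_common(3))
```

On a real corpus, the same frequency counts could be compared across authors or genres to look for the stylistic and lexical traces listed above.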

Over the last few years, as these technologies have progressed, the basic tools have evolved to the point of being used for tasks that require linguistic intelligence and ecdotic sensitivity, such as:

  • the restoration and interpretation of the corrupted text of a manuscript;
  • automatic handwriting recognition.

This area of CL is closely tied to textual and philological analysis and, more generally, to the interdisciplinary field known as Humanities Informatics (or Digital Humanities – DH).

Computational modelling of language comprehension or production has often been seen as a means to explore fundamental theoretical questions in both Linguistics and Psycholinguistics. From this perspective, the questions that the computational linguist asks are the same as those of the linguist or psycholinguist:

  • how does language work?
  • how is language learned?
  • how does it change over time and across communicative situations or domains?

The assumption is that, by building a computational model of a linguistic process, a better understanding of the phenomenon can be achieved. Much frontier research in Linguistics and Psycholinguistics today makes extensive use of CL techniques and models in this sense. This approach can, for example, make it possible to:

  • decipher a still unknown ancient language;
  • study how two languages descended from the same stock have changed over time;
  • understand how the mental lexicon works in our brain and what can alter its functioning.

Main application sectors

In addition to the advancement of research in the Social Sciences and Humanities (SSH), collaboration between computational linguists and specialists from other disciplines enables the development of innovative methods and technologies that can be applied in many strategic fields, including:

  • Cultural Heritage;
  • Sustainable Tourism;
  • Education and Training;
  • Digital Administration;
  • Digital Justice;
  • Digital Health;
  • Enterprise and Innovation;
  • Third Sector and Inclusion.

Computational Linguistics between Artificial Intelligence and the Social Sciences and Humanities: scientific and technological results, open challenges and application impacts.

Chap. 10 of the Cnr-Dipartimento Scienze Umane e Sociali, Patrimonio Culturale report “Le scienze umane, sociali e del patrimonio culturale nell’era delle grandi transizioni”.