The project LingUistic Complexity Evaluation in educaTion (LuCET), funded by the Italian Ministry of Education, University and Research as a Project of Relevant National Interest, investigates Linguistic Complexity (LC) in Italian and its relationship with user-based Processing Difficulty (PD).
Combining expertise from theoretical, computational and applied linguistics, psycholinguistics, pedagogy and psychometrics, the project is the first attempt to offer a comprehensive and multi-level set of measures and procedures to assess LC and to provide a systematic account of its relationship with PD.
The study will focus on students attending the final grade of secondary school, who have been investigated less than younger school-age children. In particular, it will consider different populations of grade 13 students (approximately 18-19 years old), with typical and atypical developmental profiles, and with Italian as a first or additional language.
LuCET aims to establish a virtuous circle between, on the one hand, fundamental, interdisciplinary, knowledge-oriented research on LC/PD and, on the other hand, application-oriented research, which will provide more appropriate measures of LC and PD and apply them to educational settings.
LuCET will propose theoretical and operational definitions of LC in the domains of lexicon, morphology, and syntax and will develop systematic procedures to assess LC in written productions by students from different populations. It will also produce experimental protocols (e.g. comprehension, eye-tracking, repetition tasks) to investigate how varying degrees of LC across linguistic levels affect receptive processing in the same populations.
The challenge of offering a multi-level and comprehensive definition of LC will be met by resorting to Natural Language Processing (NLP) methods and techniques based on machine learning algorithms, including deep learning. These will be used to extract and weight a wide range of LC features from real language usage (both students’ productions and texts of varying degrees of LC) in order to model LC and PD computationally.
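As a purely illustrative sketch of what extracting LC features from a text can look like, the following Python fragment computes three generic surface indicators. The function name `surface_features`, the choice of indicators (mean sentence length, type-token ratio, mean word length) and the naive tokenization are assumptions made for this example; they are not the feature set or the NLP pipeline adopted by LuCET, which targets a much wider range of lexical, morphological and syntactic features.

```python
import re


def surface_features(text: str) -> dict[str, float]:
    """Compute three simple surface indicators of complexity (illustrative only)."""
    # Naive sentence split on ., ! and ? (an assumption; a real pipeline
    # would rely on a proper sentence splitter and linguistic annotation).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive word tokenization: lower-cased runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    if not sentences or not tokens:
        raise ValueError("text must contain at least one word")
    return {
        # Average number of tokens per sentence.
        "mean_sentence_length": len(tokens) / len(sentences),
        # Distinct word forms divided by total tokens (lexical variety).
        "type_token_ratio": len(set(tokens)) / len(tokens),
        # Average number of characters per token.
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
    }


features = surface_features("Il gatto dorme. Il cane corre nel parco.")
print(features)
```

In a machine learning setting, indicators of this kind would typically be combined with many others and weighted by a model trained on texts of known complexity, rather than interpreted one by one.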
Multi-level models and metrics of LC/PD can play a key role in pedagogical and psychometric studies focusing on language skills. LuCET will investigate whether and to what extent a better understanding of the interplay between LC and PD may inform effective teaching practices.
LC/PD models will also be applied to data from a large-scale assessment framework, i.e., the INVALSI National Assessment of reading comprehension.
The main aim of this application will be to investigate the role of text features (including LC), task demands, and respondent skills in predicting the probability of solving comprehension items, thus creating a connection between cognitive models of reading comprehension and the psychometric properties of items.
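One common way to relate item and respondent characteristics to the probability of a correct answer is a logistic model in which item difficulty is a weighted sum of item features. The sketch below illustrates that general idea only; it is not the model LuCET will adopt. The function `p_correct`, the feature names `text_lc` and `task_demand`, and all numeric values are hypothetical.

```python
import math


def p_correct(ability: float,
              item_features: dict[str, float],
              weights: dict[str, float]) -> float:
    """Probability of solving an item under a simple logistic model.

    Item difficulty is assumed to be a weighted sum of item features
    (hypothetical names and weights, for illustration only).
    """
    difficulty = sum(weights[name] * value
                     for name, value in item_features.items())
    # Logistic link: the probability rises as ability exceeds difficulty.
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical weights and item description (not estimated from any data).
weights = {"text_lc": 0.8, "task_demand": 0.5}
item = {"text_lc": 1.0, "task_demand": 0.4}

p_low = p_correct(0.0, item, weights)   # less skilled respondent
p_mid = p_correct(1.0, item, weights)   # ability equal to item difficulty
p_high = p_correct(2.0, item, weights)  # more skilled respondent
print(p_low, p_mid, p_high)
```

In practice the weights would be estimated from response data, so that the contribution of each text or task feature to item difficulty can be inspected directly.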