IMPaCTS is the first fully automatically created Italian corpus of original-simplified sentence pairs, containing 1,444,160 pairs. For each input sentence, the corpus provides multiple simplifications at different readability levels. It is automatically annotated with readability scores and a rich set of linguistic features. The data cover two domains: Wikipedia and administrative texts.
IMPaCTS has been successfully used to fine-tune neural language models for readability-controlled sentence simplification.
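
As a rough illustration of how the one-to-many structure could be used for readability-controlled selection, the sketch below groups sentence pairs by their original sentence and picks the simplification whose score is closest to a target. The field names (`original`, `simplified`, `readability`), the toy records, and the target value are all assumptions made for this example; they are not the corpus's actual schema or data.

```python
from collections import defaultdict

# Toy records with hypothetical field names; the real corpus schema may differ.
pairs = [
    {"original": "sentence A", "simplified": "A, variant 1", "readability": 55.0},
    {"original": "sentence A", "simplified": "A, variant 2", "readability": 80.0},
    {"original": "sentence B", "simplified": "B, variant 1", "readability": 62.0},
]


def group_by_original(records):
    """Collect all simplifications that share the same original sentence."""
    groups = defaultdict(list)
    for record in records:
        groups[record["original"]].append(record)
    return dict(groups)


def pick_closest(candidates, target):
    """Return the candidate whose readability score is nearest to the target."""
    return min(candidates, key=lambda r: abs(r["readability"] - target))


groups = group_by_original(pairs)
best = pick_closest(groups["sentence A"], target=75.0)
print(best["simplified"])  # prints "A, variant 2"
```

The same grouping step would apply when building training examples for a readability-controlled model, with the target score supplied as a control signal alongside the original sentence.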
