KNOWLEDGE EXTRACTION TOOLS

T2K2: Text-To-Knowledge

T2K2 is a platform for automatically extracting linguistic and domain-specific information from document collections. It organises the extracted knowledge in a structured form and indexes the analysed texts against it. It relies on a set of tools for Natural Language Processing (NLP), statistical text analysis and machine learning, which are dynamically integrated to give an accurate representation of the linguistic information and domain-specific content of English and Italian text corpora across different domains.

READ-IT: Assessing Readability of Italian Texts

READ-IT is the first advanced readability assessment tool for Italian. It combines traditional raw text features with lexical, morpho-syntactic and syntactic information. Readability is assessed at both the document level and the sentence level. Sentence-level assessment is the main novelty of the approach, since it lays the groundwork for aligning readability assessment with the process of text simplification.
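
To make the idea of raw textual features at sentence and document level more concrete, here is a minimal Python sketch. It is not READ-IT's implementation: the function names (raw_text_features, split_sentences, document_features), the regex-based tokenisation and the choice of features (token count, average word length, average sentence length) are assumptions made purely for illustration. The lexical, morpho-syntactic and syntactic features mentioned above would require a full NLP pipeline and are left out.

```python
import re


def raw_text_features(sentence: str) -> dict:
    """Compute simple surface features for one sentence (illustrative only)."""
    # Naive tokenisation: runs of word characters (Unicode-aware in Python 3).
    tokens = re.findall(r"\w+", sentence)
    n_tokens = len(tokens)
    avg_word_len = sum(len(t) for t in tokens) / n_tokens if n_tokens else 0.0
    return {"n_tokens": n_tokens, "avg_word_len": avg_word_len}


def split_sentences(text: str) -> list:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def document_features(text: str) -> dict:
    """Aggregate sentence-level features into document-level ones."""
    per_sentence = [raw_text_features(s) for s in split_sentences(text)]
    n_sentences = len(per_sentence)
    total_tokens = sum(f["n_tokens"] for f in per_sentence)
    return {
        "n_sentences": n_sentences,
        "avg_sentence_len": total_tokens / n_sentences if n_sentences else 0.0,
        "sentences": per_sentence,  # sentence-level view kept alongside the document view
    }


if __name__ == "__main__":
    print(document_features("Il gatto dorme. La casa è grande e bella!"))
```

Keeping the per-sentence features next to the document-level averages mirrors the two granularities described above: a long or complex sentence can be flagged on its own even when the document as a whole scores as easy.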

PANACEA WebServices

The PANACEA web services were developed within the European project PANACEA and are hosted at CNR-ILC. They enable the automatic construction of language resources and offer format converters, POS taggers, dependency parsers and lexical acquisition tools (multiword and subcategorisation extractors, lexicon mergers). Tutorials on using these services and composing workflows are also available.