{"id":10771,"date":"2022-07-25T08:39:55","date_gmt":"2022-07-25T06:39:55","guid":{"rendered":"https:\/\/www.ilc.cnr.it\/resources\/"},"modified":"2023-05-19T10:54:19","modified_gmt":"2023-05-19T08:54:19","slug":"resources","status":"publish","type":"page","link":"https:\/\/www.ilc.cnr.it\/en\/resources\/","title":{"rendered":"RESOURCES"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\"><strong>Annotated corpora<\/strong><\/h1>\n\n\n\n<h3 class=\"wp-block-heading\">ISST-TANL Corpus<\/h3>\n\n\n\n<p>It is a manually annotated corpus, encoded in the standard CoNLL format and including PoS tagging and syntactic dependency annotation. Jointly developed by <strong>CNR-ILC<\/strong> and the University of Pisa, it represents general language use and consists of articles extracted from newspapers and periodicals, selected to cover a wide variety of topics. This corpus was used for training and testing in the shared task &#8220;Domain Adaptation for Dependency Parsing&#8221; of EVALITA 2011.<\/p>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h1 class=\"wp-block-heading\"><strong>Unannotated corpora<\/strong><\/h1>\n\n\n\n<h3 class=\"wp-block-heading\">CLIC<\/h3>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h1 class=\"wp-block-heading\">Lexica<\/h1>\n\n\n\n<h3 class=\"wp-block-heading\">PAROLE-SIMPLE-CLIPS<\/h3>\n\n\n\n<p>It is a four-level general-purpose lexicon developed across three different projects. The morphological and syntactic lexicon core was built within the European project &#8220;Preparatory Action for the Organisation of Language Resources for Language Engineering&#8221; (LE-PAROLE). The linguistic model and the semantic lexicon core were developed within the European project &#8220;Semantic Information for Multifunctional Multilingual Lexicons&#8221; (LE-SIMPLE). 
The phonological level of description and the extension of the lexical coverage were carried out within the Italian project &#8220;Corpora e Lessici dell&#8217;Italiano Parlato e Scritto&#8221; (CLIPS). It comprises a total of 387,267 phonetic units, 53,044 morphological units (53,044 lemmas), 37,406 syntactic units (28,111 lemmas) and 28,346 semantic units (19,216 lemmas). It has been semantically coded in full compliance with the international standards specified in the PAROLE-SIMPLE model and based on EAGLES. The syntactic and semantic encoding was carried out in collaboration with Thamus (Consortium for Multilingual Documentary Engineering), which is responsible for 25,000 additional entries.<\/p>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">SIMPLE LOD<\/h3>\n\n\n\n<p>It is the RDF serialisation of all nouns extracted from the PAROLE-SIMPLE-CLIPS lexicon. Lexical entries are serialised in Lemon, while semantic relations are modelled according to SIMPLE&#8217;s OWL.<\/p>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">ItalWordNet LOD<\/h3>\n\n\n\n<p><strong><a href=\"http:\/\/datahub.io\/dataset\/iwn\" target=\"_blank\" rel=\"noreferrer noopener\">datahub<\/a><\/strong>; <a href=\"http:\/\/www.languagelibrary.eu\/owl\/italWordNet15\/schema\/synset\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>ilc<\/strong><\/a><\/p>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">FrameNet<\/h3>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">GeoDomainWordNet<\/h3>\n\n\n\n<p><strong><a href=\"\/\/datahub.io\/dataset\/geodomainwn\" target=\"_blank\" rel=\"noreferrer noopener\">datahub<\/a><\/strong>; <strong><a 
href=\"http:\/\/www.languagelibrary.eu\/owl\/geodomainWN\/eng\/geonames-synset\" target=\"_blank\" rel=\"noreferrer noopener\">ILC for English<\/a><\/strong>; <strong><a href=\"http:\/\/www.languagelibrary.eu\/owl\/geodomainWN\/ita\/geonames-synset\" target=\"_blank\" rel=\"noreferrer noopener\">ILC for Italian<\/a> <\/strong>\r\nThe concepts of the GeoNames ontology, with their labels and glosses in English and Italian, have been transformed into a WordNet-like resource and linked to the generic WordNets of both languages. This resource is published in RDF in accordance with W3C standards and the Lemon schema.<\/p>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>AncientGreekWordNet LOD<\/strong><\/h3>\n\n\n\n<p>Linked open data related to the &#8216;AncientGreekWordNet&#8217; section of CoPhiWordNet.<\/p>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Sentiment Lexicon LOD<\/h3>\n\n\n\n<p>The <a href=\"https:\/\/github.com\/opener-project\/public-sentiment-lexicons\/tree\/master\/propagation_lexicons\/it\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Italian Sentiment Lexicon<\/strong><\/a> (in LMF format) was derived semi-automatically from ItalWordNet, starting from a manually checked list of 1,000 keywords. It contains 24,293 lexical entries annotated with positive\/negative\/neutral polarity.<\/p>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h1 class=\"wp-block-heading\"><strong>Domain Terminologies<\/strong><\/h1>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"http:\/\/www.imagact.it\/imagact\/query\/dictionary.seam\" target=\"_blank\" rel=\"noreferrer noopener\">IMAG-Act<\/a><\/h3>\n\n\n\n<p>It is an interlingual action ontology. 
Using speech corpora, 1,010 high-frequency action concepts were identified and visually represented with prototypical scenes. The ontology allows the definition of interlingual correspondences between verbs and actions in English, Italian, Chinese and Spanish. Thanks to the visual representation of the identified action concepts, IMAG-Act can potentially be extended to any language.<\/p>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">FiscalDB<\/h3>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">SindacDB<\/h3>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Mariterm<\/h3>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Biolessico<\/h3>\n\n\n\n<div style=\"height:13px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Ontologies<\/h3>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Other resources<\/h1>\n\n\n\n<p>The <strong><a href=\"https:\/\/ilc4clarin.ilc.cnr.it\/en\/\" target=\"_blank\" rel=\"noreferrer noopener\">ILC4CLARIN<\/a><\/strong> repository hosts a constantly updated collection of language resources developed by the <strong>Cnr-Istituto di Linguistica Computazionale &#8220;Antonio Zampolli&#8221;<\/strong>. 
These resources are deposited and made available in accordance with the FAIR (Findable, Accessible, Interoperable, Reusable) principles.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-layout-flex wp-container-core-buttons-layout-1 wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/dspace-clarin-it.ilc.cnr.it\/repository\/xmlui\/discover?filtertype=branding&amp;filter_relational_operator=equals&amp;filter=ILC\" target=\"_blank\" rel=\"noreferrer noopener\">BROWSE THE COLLECTION<\/a><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Annotated corpora ISST-TANL Corpus It is a manually annotated corpus, encoded in the standard CoNLL format and including PoS tagging&hellip;<\/p>\n","protected":false},"author":3,"featured_media":10092,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"acf":[],"jetpack_sharing_enabled":true,"publishpress_future_action":{"enabled":false,"date":"2026-04-15 
01:33:31","action":"change-status","newStatus":"draft","terms":[],"taxonomy":"translation_priority"},"publishpress_future_workflow_manual_trigger":{"enabledWorkflows":[]},"_links":{"self":[{"href":"https:\/\/www.ilc.cnr.it\/en\/wp-json\/wp\/v2\/pages\/10771"}],"collection":[{"href":"https:\/\/www.ilc.cnr.it\/en\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.ilc.cnr.it\/en\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.ilc.cnr.it\/en\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ilc.cnr.it\/en\/wp-json\/wp\/v2\/comments?post=10771"}],"version-history":[{"count":16,"href":"https:\/\/www.ilc.cnr.it\/en\/wp-json\/wp\/v2\/pages\/10771\/revisions"}],"predecessor-version":[{"id":14875,"href":"https:\/\/www.ilc.cnr.it\/en\/wp-json\/wp\/v2\/pages\/10771\/revisions\/14875"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.ilc.cnr.it\/en\/wp-json\/wp\/v2\/media\/10092"}],"wp:attachment":[{"href":"https:\/\/www.ilc.cnr.it\/en\/wp-json\/wp\/v2\/media?parent=10771"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}