Goto

Collaborating Authors

 Ronzano, Francesco


Towards Ontology-Enhanced Representation Learning for Large Language Models

arXiv.org Artificial Intelligence

Taking advantage of the widespread use of ontologies to organise and harmonize knowledge across several distinct domains, this paper proposes a novel approach to improve an embedding-Large Language Model (embedding-LLM) of interest by infusing the knowledge formalized by a reference ontology: ontological knowledge infusion aims at boosting the ability of the considered LLM to effectively model the knowledge domain described by the infused ontology. The linguistic information (i.e. concept synonyms and descriptions) and structural information (i.e. is-a relations) formalized by the ontology are utilized to compile a comprehensive set of concept definitions, with the assistance of a powerful generative LLM (i.e. GPT-3.5-turbo). These concept definitions are then employed to fine-tune the target embedding-LLM using a contrastive learning framework. To demonstrate and evaluate the proposed approach, we utilize the biomedical disease ontology MONDO. The results show that embedding-LLMs enhanced by ontological disease knowledge exhibit an improved capability to effectively evaluate the similarity of in-domain sentences from biomedical documents mentioning diseases, without compromising their out-of-domain performance.


ExTaSem! Extending, Taxonomizing and Semantifying Domain Terminologies

AAAI Conferences

We introduce ExTaSem!, a novel approach for the automatic learning of lexical taxonomies from domain terminologies. First, we exploit a very large semantic network to collect housands of in-domain textual definitions. Second, we extract (hyponym, hypernym) pairs from each definition with a CRF-based algorithm trained on manually-validated data. Finally, we introduce a graph induction procedure which constructs a full-fledged taxonomy where each edge is weighted according to its domain pertinence. ExTaSem! achieves state-of-the-art results in the following taxonomy evaluation experiments: (1) Hypernym discovery, (2) Reconstructing gold standard taxonomies, and (3) Taxonomy quality according to structural measures. We release weighted taxonomies for six domains for the use and scrutiny of the community.


Do We Criticise (and Laugh) in the Same Way? Automatic Detection of Multi-Lingual Satirical News in Twitter

AAAI Conferences

During the last few years, the investigation of methodologies to automatically detect and characterise the figurative traits of textual contents has attracted a growing interest. Indeed, the capability to correctly deal with figurative language and more specifically with satire is fundamental to build robust approaches in several sub-fields of Artificial Intelligence including Sentiment Analysis and Affective Computing. In this paper we investigate the automatic detection of Tweets that advertise satirical news in English, Spanish and Italian. To this purpose we present a system that models Tweets from different languages by a set of language independent features that describe lexical, semantic and usage-related properties of the words of each Tweet. We approach the satire identification problem as binary classification of Tweets as satirical or not satirical messages. We test the performance of our system by performing experiments of both monolingual and cross-language classifications, evaluating the satire detection effectiveness of our features.Our system outperforms a word-based baseline and it is able to recognise if a news in Twitter is satirical or not with good accuracy. Moreover, we analyse the behaviour of the system across the different languages, obtaining interesting results.