Ontologies
Ontology-based Feature Selection: A Survey
Sikelis, Konstantinos, Tsekouras, George E, Kotis, Konstantinos I
The Semantic Web emerged as an extension to the traditional Web, towards adding meaning to a distributed Web of structured and linked data. At its core, the concept of ontology provides the means to semantically describe and structure information and data and expose it to software and human agents in a machine and human-readable form. For software agents to be realized, it is crucial to develop powerful artificial intelligence and machine learning techniques, able to extract knowledge from information and data sources and represent it in the underlying ontology. This survey aims to provide insight into key aspects of ontology-based knowledge extraction, from various sources such as text, images, databases and human expertise, with emphasis on the task of feature selection. First, some of the most common classification and feature selection algorithms are briefly presented. Then, selected methodologies, which utilize ontologies to represent features and perform feature selection and classification, are described. The presented examples span diverse application domains, e.g., medicine, tourism, mechanical and civil engineering, and demonstrate the feasibility and applicability of such methods.
Statistics versus Machine Learning: should they really be opposed?
This "seemingly" old debate deserves to be revisited with a fresh perspective. Even though the field of application is fairly recent, the basic methods used in Data Science are for the most part some forty years old now. To recall, the two main branches concerned are statistics on the one hand and machine learning on the other, to which I would add a third branch that consists of what could be called "business ontologies", i.e. "structured sets of terms and concepts representing business know-how or a field of application" (Wikipedia). Some people are absorbed by the versus debate, comparing the statistical and machine learning approaches and their efficiency, ROI and cost within the context of predictive applications (predictive marketing, Digital Marketing, customer knowledge, etc.). This debate is by no means a new one, in the sense that the two "schools" sprang from two different intellectual trends.
Document Structure aware Relational Graph Convolutional Networks for Ontology Population
Shalghar, Abhay M, Kumar, Ayush, Ganesan, Balaji, Kannan, Aswin, G, Shobha
Ontologies comprising concepts, their attributes, and relationships form the quintessential backbone of many knowledge-based AI systems. These systems manifest in the form of question-answering or dialogue in a number of business analytics and master data management applications. While there have been efforts towards populating domain-specific ontologies, we examine the role of document structure in learning ontological relationships between concepts in any document corpus. Inspired by ideas from hypernym discovery and explainability, our method is about 15 points more accurate than a stand-alone R-GCN model for this task.
Knowledge Triggering, Extraction and Storage via Human-Robot Verbal Interaction
Grassi, Lucrezia, Recchiuto, Carmine Tommaso, Sgorbissa, Antonio
This article describes a novel approach to expand at run time the knowledge base of an Artificial Conversational Agent. A technique for automatic knowledge extraction from the user's sentence and four methods to insert the newly acquired concepts in the knowledge base have been developed and integrated into a system that has already been tested for knowledge-based conversation between a social humanoid robot and residents of care homes. The run-time addition of new knowledge allows overcoming a limitation that affects most robots and chatbots: the inability to engage the user for a long time due to the restricted number of conversation topics. The insertion in the knowledge base of new concepts recognized in the user's sentence is expected to result in a wider range of topics that can be covered during an interaction, making the conversation less repetitive. Two experiments are presented to assess the performance of the knowledge extraction technique, and the efficiency of the developed insertion methods when adding several concepts in the Ontology.
ROC: An Ontology for Country Responses towards COVID-19
Qundus, Jamal Al, Schäfermeier, Ralph, Karam, Naouel, Peikert, Silvio, Paschke, Adrian
The ROC ontology for country responses to COVID-19 provides a model for collecting, linking and sharing data on the COVID-19 pandemic. It follows semantic standardization (W3C standards RDF, OWL, SPARQL) for the representation of concepts and creation of vocabularies. ROC focuses on country measures and enables the integration of data from heterogeneous data sources. The proposed ontology is intended to facilitate statistical analysis to study and evaluate the effectiveness and side effects of government responses to COVID-19 in different countries. The ontology contains data collected by OxCGRT from publicly available information. This data has been compiled from information provided by ECDC for most countries, as well as from various repositories used to collect data on COVID-19.
Instance-Level Update in DL-Lite Ontologies through First-Order Rewriting
De Giacomo, Giuseppe (Sapienza University of Rome) | Oriol, Xavier (Universitat Politècnica de Catalunya) | Rosati, Riccardo (Sapienza University of Rome) | Savo, Domenico Fabio (Università degli Studi di Bergamo)
In this paper we study instance-level update in DL-LiteA, a well-known description logic that influenced the OWL 2 QL standard. Instance-level update regards insertions and deletions in the ABox of an ontology. In particular we focus on formula-based approaches to instance-level update. We show that DL-LiteA, which is well-known for enjoying first-order rewritability of query answering, enjoys a first-order rewritability property also for instance-level update. That is, every update can be reformulated into a set of insertion and deletion instructions computable through a non-recursive Datalog program with negation. Such a program is readily translatable into a first-order query over the ABox considered as a database, and hence into SQL. By exploiting this result, we implement an update component for DL-LiteA-based systems and perform some experiments showing that the approach works in practice.
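The idea of turning an ABox update into plain insertion and deletion instructions over a database can be illustrated with a minimal sketch. It is not the rewriting algorithm of the paper: it assumes a hypothetical TBox that contains only two disjointness axioms, ignores concept inclusions and roles entirely, and uses an invented table name (abox) and helper (insert_assertion).

```python
import sqlite3

# Hypothetical TBox fragment: pairs of concepts declared disjoint (assumption).
DISJOINT = {("Student", "Professor"), ("Professor", "Student")}


def insert_assertion(db: sqlite3.Connection, concept: str, individual: str) -> None:
    """Insert concept(individual) after deleting assertions that would clash with it."""
    clashing = [d for (c, d) in DISJOINT if c == concept]
    for d in clashing:
        # Deletion instruction: remove the assertion that contradicts the new fact.
        db.execute(
            "DELETE FROM abox WHERE concept = ? AND individual = ?",
            (d, individual),
        )
    # Insertion instruction: add the new fact (no-op if already present).
    db.execute("INSERT OR IGNORE INTO abox VALUES (?, ?)", (concept, individual))


# The ABox is stored as an ordinary relational table.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE abox (concept TEXT, individual TEXT, "
    "PRIMARY KEY (concept, individual))"
)
db.execute("INSERT INTO abox VALUES ('Student', 'ann')")

insert_assertion(db, "Professor", "ann")
rows = db.execute(
    "SELECT concept, individual FROM abox ORDER BY concept"
).fetchall()
print(rows)
```

In this toy setting the update keeps the ABox consistent by removing Student(ann) before adding Professor(ann); a faithful formula-based update would also have to preserve facts entailed by the deleted assertion, which the sketch does not attempt.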
INODE: Building an End-to-End Data Exploration System in Practice [Extended Vision]
Amer-Yahia, Sihem, Koutrika, Georgia, Bastian, Frederic, Belmpas, Theofilos, Braschler, Martin, Brunner, Ursin, Calvanese, Diego, Fabricius, Maximilian, Gkini, Orest, Kosten, Catherine, Lanti, Davide, Litke, Antonis, Lücke-Tieke, Hendrik, Massucci, Francesco Alessandro, de Farias, Tarcisio Mendes, Mosca, Alessandro, Multari, Francesco, Papadakis, Nikolaos, Papadopoulos, Dimitris, Patil, Yogendra, Personnaz, Aurélien, Rull, Guillem, Sima, Ana, Smith, Ellery, Skoutas, Dimitrios, Subramanian, Srividya, Xiao, Guohui, Stockinger, Kurt
A full-fledged data exploration system must combine different access modalities with a powerful concept of guiding the user in the exploration process, by being reactive and anticipative both for data discovery and for data linking. Such systems are a real opportunity for our community to cater to users with different domain and data science expertise. We introduce INODE -- an end-to-end data exploration system -- that leverages, on the one hand, Machine Learning and, on the other hand, semantics for the purpose of Data Management (DM). Our vision is to develop a classic unified, comprehensive platform that provides extensive access to open datasets, and we demonstrate it in three significant use cases in the fields of Cancer Biomarker Research, Research and Innovation Policy Making, and Astrophysics. INODE offers sustainable services in (a) data modeling and linking, (b) integrated query processing using natural language, (c) guidance, and (d) data exploration through visualization, thus facilitating the user in discovering new insights. We demonstrate that our system is uniquely accessible to a wide range of users from larger scientific communities to the public. Finally, we briefly illustrate how this work paves the way for new research opportunities in DM.
Revisiting Indirect Ontology Alignment : New Challenging Issues in Cross-Lingual Context
The ontology alignment process is overwhelmingly cited in Knowledge Engineering as a key mechanism aimed at bypassing heterogeneity and reconciling various data sources, represented by ontologies, i.e., the Semantic Web cornerstone. In such infrastructures and environments, it is inconceivable to assume that all ontologies covering a particular domain of knowledge are aligned in pairs. Moreover, the high performance of alignment approaches is closely related to two factors, i.e., time consumption and machine resource limitations. Thus, good quality alignments are valuable and it would be appropriate to exploit them. Based on this observation, this article introduces a new method of indirect alignment of ontologies in a cross-lingual context. Indeed, the proposed method deals with alignments of multilingual ontologies and implements an indirect ontology alignment strategy based on the composition and reuse of effective direct alignments. The proposed method is driven by an alignment algebra which governs the semantic composition of relationships and confidence values. The results obtained after a thorough and detailed experiment are very encouraging and highlight many positive aspects of the proposed method.
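Composing two direct alignments into an indirect one can be sketched as follows. The abstract does not give the algebra, so everything specific here is an assumption made for illustration: the relation symbols ("=", "<", ">"), the composition table, the product rule for confidence values, and the entity names.

```python
from typing import Dict, Tuple

# An alignment maps (source_entity, target_entity) -> (relation, confidence).
Alignment = Dict[Tuple[str, str], Tuple[str, float]]

# Hypothetical composition table: "=" is equivalence, "<" is "narrower than",
# ">" is "broader than". Pairs not listed are treated as non-composable.
COMPOSE = {
    ("=", "="): "=",
    ("=", "<"): "<",
    ("<", "="): "<",
    ("<", "<"): "<",
    ("=", ">"): ">",
    (">", "="): ">",
    (">", ">"): ">",
}


def compose(ab: Alignment, bc: Alignment) -> Alignment:
    """Derive an A-to-C alignment from A-to-B and B-to-C alignments."""
    ac: Alignment = {}
    for (a, b), (rel1, conf1) in ab.items():
        for (b2, c), (rel2, conf2) in bc.items():
            if b != b2:
                continue  # the two correspondences do not share a pivot entity
            rel = COMPOSE.get((rel1, rel2))
            if rel is None:
                continue  # e.g. "<" followed by ">" yields no definite relation
            conf = conf1 * conf2  # assumed rule: multiply confidence values
            # Keep the most confident derivation for each entity pair.
            if (a, c) not in ac or conf > ac[(a, c)][1]:
                ac[(a, c)] = (rel, conf)
    return ac


# Reuse a French-English and an English-German alignment to obtain French-German.
fr_en: Alignment = {("fr:Voiture", "en:Car"): ("=", 0.9)}
en_de: Alignment = {("en:Car", "de:Fahrzeug"): ("<", 0.8)}
fr_de = compose(fr_en, en_de)
print(fr_de)
```

Multiplying confidences makes an indirect correspondence never more confident than the direct ones it reuses; other aggregation rules (for example, taking the minimum) would fit the same structure.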
Learning Description Logic Ontologies. Five Approaches. Where Do They Stand?
The quest for acquiring a formal representation of the knowledge of a domain of interest has attracted researchers with various backgrounds into a diverse field called ontology learning. We highlight classical machine learning and data mining approaches that have been proposed for (semi-)automating the creation of description logic (DL) ontologies. These are based on association rule mining, formal concept analysis, inductive logic programming, computational learning theory, and neural networks. We provide an overview of each approach and how it has been adapted for dealing with DL ontologies. Finally, we discuss the benefits and limitations of each of them for learning DL ontologies.
The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly Articles
Salatino, Angelo A., Osborne, Francesco, Thanapalasingam, Thiviyan, Motta, Enrico
Classifying research papers according to their research topics is an important task to improve their retrievability, assist the creation of smart analytics, and support a variety of approaches for analysing and making sense of the research environment. In this paper, we present the CSO Classifier, a new unsupervised approach for automatically classifying research papers according to the Computer Science Ontology (CSO), a comprehensive ontology of research areas in the field of Computer Science. The CSO Classifier takes as input the metadata associated with a research paper (title, abstract, keywords) and returns a selection of research concepts drawn from the ontology. The approach was evaluated on a gold standard of manually annotated articles, yielding a significant improvement over alternative methods.
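The input/output contract described above (paper metadata in, ontology concepts out) can be illustrated with a toy sketch. It is not the CSO Classifier's algorithm: it assumes a tiny hand-made ontology (TOY_ONTOLOGY), matches topic labels by exact n-gram lookup, and adds broader topics by following hypothetical "broader" links.

```python
import re
from typing import Dict, List, Set

# Hypothetical miniature ontology: topic label -> set of broader topics.
TOY_ONTOLOGY: Dict[str, Set[str]] = {
    "ontology alignment": {"semantic web"},
    "semantic web": {"computer science"},
    "feature selection": {"machine learning"},
    "machine learning": {"artificial intelligence"},
    "artificial intelligence": {"computer science"},
    "computer science": set(),
}


def ngrams(tokens: List[str], max_n: int = 3) -> Set[str]:
    """Return all word n-grams of length 1..max_n as space-joined strings."""
    grams: Set[str] = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams


def classify(title: str, abstract: str, keywords: List[str]) -> List[str]:
    """Return ontology topics whose labels occur in the metadata, plus broader topics."""
    text = " ".join([title, abstract, " ".join(keywords)]).lower()
    tokens = re.findall(r"[a-z]+", text)
    matched = ngrams(tokens) & set(TOY_ONTOLOGY)
    # Enrich the result by walking up the "broader" links.
    result = set(matched)
    frontier = list(matched)
    while frontier:
        topic = frontier.pop()
        for broader in TOY_ONTOLOGY[topic]:
            if broader not in result:
                result.add(broader)
                frontier.append(broader)
    return sorted(result)


topics = classify("Feature selection with ontologies", "", [])
print(topics)
```

Exact label matching misses paraphrases and inflected forms, which is one reason a practical classifier would need more than this lookup.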