Ontologies
The ERA of FOLE: Foundation
This paper discusses the representation of ontologies in the first-order logical environment {\ttfamily FOLE}. An ontology defines the primitives with which to model the knowledge resources for a community of discourse. These primitives consist of classes, relationships and properties. An ontology uses formal axioms to constrain the interpretation of these primitives. In short, an ontology specifies a logical theory. This paper continues the discussion of the representation and interpretation of ontologies in the first-order logical environment {\ttfamily FOLE}. The formalism and semantics of (many-sorted) first-order logic can be developed in both a \emph{classification form} and an \emph{interpretation form}. Two papers, the current paper, defining the concept of a structure, and ``The {\ttfamily ERA} of {\ttfamily FOLE}: Superstructure'', defining the concept of a sound logic, represent the \emph{classification form}, corresponding to ideas discussed in the ``Information Flow Framework''. Two papers, ``The {\ttfamily FOLE} Table'', defining the concept of a relational table, and ``The {\ttfamily FOLE} Database'', defining the concept of a relational database, represent the \emph{interpretation form}, expanding on material found in the paper ``Database Semantics''. Although the classification form follows the entity-relationship-attribute data model of Chen, the interpretation form incorporates the relational data model of Codd. A fifth paper ``{\ttfamily FOLE} Equivalence'' proves that the classification form is equivalent to the interpretation form. In general, the {\ttfamily FOLE} representation uses a conceptual structures approach, that is completely compatible with the theory of institutions, formal concept analysis and information flow.
An Ecosystem for Personal Knowledge Graphs: A Survey and Research Roadmap
Skjæveland, Martin G., Balog, Krisztian, Bernard, Nolwenn, Lajewska, Weronika, Linjordet, Trond
This paper presents an ecosystem for personal knowledge graphs (PKG), commonly defined as resources of structured information about entities related to an individual, their attributes, and the relations between them. PKGs are a key enabler of secure and sophisticated personal data management and personalized services. However, there are challenges that need to be addressed before PKGs can achieve widespread adoption. One of the fundamental challenges is the very definition of what constitutes a PKG, as there are multiple interpretations of the term. We propose our own definition of a PKG, emphasizing the aspects of (1) data ownership by a single individual and (2) the delivery of personalized services as the primary purpose. We further argue that a holistic view of PKGs is needed to unlock their full potential, and propose a unified framework for PKGs, where the PKG is a part of a larger ecosystem with clear interfaces towards data services and data sources. A comprehensive survey and synthesis of existing work is conducted, with a mapping of the surveyed work into the proposed unified ecosystem. Finally, we identify open challenges and research opportunities for the ecosystem as a whole, as well as for the specific aspects of PKGs, which include population, representation and management, and utilization.
Ontology for Healthcare Artificial Intelligence Privacy in Brazil
Vaz, Tiago Andres, Dora, José Miguel Silva, Lamb, Luís da Cunha, Camey, Suzi Alves
Using the terminology defined by current legislation, the article outlines a systematic approach to handling hospital data anonymously in preparation for its use in Artificial Intelligence (AI) applications in healthcare. The development process consisted of 7 pragmatic steps, including defining scope, selecting knowledge, reviewing important terms, constructing classes that describe designs used in epidemiological studies, machine learning paradigms, types of data and attributes, risks that anonymized data may be exposed to, privacy attacks, techniques to mitigate re-identification, privacy models, and metrics for measuring the effects of anonymization. The article concludes by demonstrating the practical implementation of this ontology in hospital settings for the development and validation of AI.
Covidia: COVID-19 Interdisciplinary Academic Knowledge Graph
Deng, Cheng, Ding, Jiaxin, Fu, Luoyi, Zhang, Weinan, Wang, Xinbing, Zhou, Chenghu
The pandemic of COVID-19 has inspired extensive works across different research fields. Existing literature and knowledge platforms on COVID-19 only focus on collecting papers on biology and medicine, neglecting the interdisciplinary efforts, which hurdles knowledge sharing and research collaborations between fields to address the problem. Studying interdisciplinary researches requires effective paper category classification and efficient cross-domain knowledge extraction and integration. In this work, we propose Covidia, COVID-19 interdisciplinary academic knowledge graph to bridge the gap between knowledge of COVID-19 on different domains. We design frameworks based on contrastive learning for disciplinary classification, and propose a new academic knowledge graph scheme for entity extraction, relation classification and ontology management in accordance with interdisciplinary researches. Based on Covidia, we also establish knowledge discovery benchmarks for finding COVID-19 research communities and predicting potential links.
Mining ℰℒ⊥ Bases with Adaptable Role Depth
Guimarães, Ricardo (Department of Informatics, University of Bergen) | Ozaki, Ana (Department of Informatics, University of Bergen) | Persia, Cosimo (Department of Informatics, University of Bergen) | Sertkaya, Baris (a:1:{s:5:"en_US";s:9:"Prof. Dr.";})
In Formal Concept Analysis, a base for a finite structure is a set of implications that characterizes all valid implications of the structure. This notion can be adapted to the context of Description Logic, where the base consists of a set of concept inclusions instead of implications. In this setting, concept expressions can be arbitrarily large. Thus, it is not clear whether a finite base exists and, if so, how large concept expressions may need to be. We first revisit results in the literature for mining ℰℒ⊥ bases from finite interpretations. Those mainly focus on finding a finite base or on fixing the role depth but potentially losing some of the valid concept inclusions with higher role depth. We then present a new strategy for mining ℰℒ⊥ bases which is adaptable in the sense that it can bound the role depth of concepts depending on the local structure of the interpretation. Our strategy guarantees to capture all ℰℒ⊥ concept inclusions holding in the interpretation, not only the ones up to a fixed role depth. We also consider the case of confident ℰℒ⊥ bases, which requires that some proportion of the domain of the interpretation satisfies the base, instead of the whole domain. This case is useful to cope with noisy data.
Using Multiple RDF Knowledge Graphs for Enriching ChatGPT Responses
Mountantonakis, Michalis, Tzitzikas, Yannis
There is a recent trend for using the novel Artificial Intelligence ChatGPT chatbox, which provides detailed responses and articulate answers across many domains of knowledge. However, in many cases it returns plausible-sounding but incorrect or inaccurate responses, whereas it does not provide evidence. Therefore, any user has to further search for checking the accuracy of the answer or/and for finding more information about the entities of the response. At the same time there is a high proliferation of RDF Knowledge Graphs (KGs) over any real domain, that offer high quality structured data. For enabling the combination of ChatGPT and RDF KGs, we present a research prototype, called GPToLODS, which is able to enrich any ChatGPT response with more information from hundreds of RDF KGs. In particular, it identifies and annotates each entity of the response with statistics and hyperlinks to LODsyndesis KG (which contains integrated data from 400 RDF KGs and over 412 million entities). In this way, it is feasible to enrich the content of entities and to perform fact checking and validation for the facts of the response at real time.
EVKG: An Interlinked and Interoperable Electric Vehicle Knowledge Graph for Smart Transportation System
Qi, Yanlin, Mai, Gengchen, Zhu, Rui, Zhang, Michael
Over the past decade, the electric vehicle industry has experienced unprecedented growth and diversification, resulting in a complex ecosystem. To effectively manage this multifaceted field, we present an EV-centric knowledge graph (EVKG) as a comprehensive, cross-domain, extensible, and open geospatial knowledge management system. The EVKG encapsulates essential EV-related knowledge, including EV adoption, electric vehicle supply equipment, and electricity transmission network, to support decision-making related to EV technology development, infrastructure planning, and policy-making by providing timely and accurate information and analysis. To enrich and contextualize the EVKG, we integrate the developed EV-relevant ontology modules from existing well-known knowledge graphs and ontologies. This integration enables interoperability with other knowledge graphs in the Linked Data Open Cloud, enhancing the EVKG's value as a knowledge hub for EV decision-making. Using six competency questions, we demonstrate how the EVKG can be used to answer various types of EV-related questions, providing critical insights into the EV ecosystem. Our EVKG provides an efficient and effective approach for managing the complex and diverse EV industry. By consolidating critical EV-related knowledge into a single, easily accessible resource, the EVKG supports decision-makers in making informed choices about EV technology development, infrastructure planning, and policy-making. As a flexible and extensible platform, the EVKG is capable of accommodating a wide range of data sources, enabling it to evolve alongside the rapidly changing EV landscape.
Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies
Learning a visual concept from a small number of positive examples is a significant challenge for machine learning algorithms. Current methods typically fail to find the appropriate level of generalization in a concept hierarchy for a given set of visual examples. Recent work in cognitive science on Bayesian models of generalization addresses this challenge, but prior results assumed that objects were perfectly recognized. We present an algorithm for learning visual concepts directly from images, using probabilistic predictions generated by visual classifiers as the input to a Bayesian generalization model. As no existing challenge data tests this paradigm, we collect and make available a new, large-scale dataset for visual concept learning using the ImageNet hierarchy as the source of possible concepts, with human annotators to provide ground truth labels as to whether a new image is an instance of each concept using a paradigm similar to that used in experiments studying word learning in children.
Enhancing Cluster Quality of Numerical Datasets with Domain Ontology
Heiyanthuduwage, Sudath Rohitha, Rahman, Md Anisur, Islam, Md Zahidul
Ontology-based clustering has gained attention in recent years due to the potential benefits of ontology. Current ontology-based clustering approaches have mainly been applied to reduce the dimensionality of attributes in text document clustering. Reduction in dimensionality of attributes using ontology helps to produce high quality clusters for a dataset. However, ontology-based approaches in clustering numerical datasets have not been gained enough attention. Moreover, some literature mentions that ontology-based clustering can produce either high quality or low-quality clusters from a dataset. Therefore, in this paper we present a clustering approach that is based on domain ontology to reduce the dimensionality of attributes in a numerical dataset using domain ontology and to produce high quality clusters. For every dataset, we produce three datasets using domain ontology. We then cluster these datasets using a genetic algorithm-based clustering technique called GenClust++. The clusters of each dataset are evaluated in terms of Sum of Squared-Error (SSE). We use six numerical datasets to evaluate the performance of our ontology-based approach. The experimental results of our approach indicate that cluster quality gradually improves from lower to the higher levels of a domain ontology.
The Music Annotation Pattern
de Berardinis, Jacopo, Meroño-Peñuela, Albert, Poltronieri, Andrea, Presutti, Valentina
The annotation of music content is a complex process to represent due to its inherent multifaceted, subjectivity, and interdisciplinary nature. Numerous systems and conventions for annotating music have been developed as independent standards over the past decades. Little has been done to make them interoperable, which jeopardises cross-corpora studies as it requires users to familiarise with a multitude of conventions. Most of these systems lack the semantic expressiveness needed to represent the complexity of the musical language and cannot model multi-modal annotations originating from audio and symbolic sources. In this article, we introduce the Music Annotation Pattern, an Ontology Design Pattern (ODP) to homogenise different annotation systems and to represent several types of musical objects (e.g. chords, patterns, structures). This ODP preserves the semantics of the object's content at different levels and temporal granularity. Moreover, our ODP accounts for multi-modality upfront, to describe annotations derived from different sources, and it is the first to enable the integration of music datasets at a large scale.