AITopics | Ontologies

Ontologies

"An ontology defines the terms used to describe and represent an area of knowledge. … Ontologies include computer-usable definitions of basic concepts in the domain and the relationships among them."
– from OWL Web Ontology Language Use Cases and Requirements. W3C Recommendation (10 February 2004). Jeff Heflin, editor.

Spider4SPARQL: A Complex Benchmark for Evaluating Knowledge Graph Question Answering Systems

Kosten, Catherine, Cudré-Mauroux, Philippe, Stockinger, Kurt

arXiv.org Artificial Intelligence Dec-8-2023

With the recent spike in the number and availability of Large Language Models (LLMs), it has become increasingly important to provide large and realistic benchmarks for evaluating Knowledge Graph Question Answering (KGQA) systems. So far the majority of benchmarks rely on pattern-based SPARQL query generation approaches. The subsequent natural language (NL) question generation is conducted through crowdsourcing or other automated methods, such as rule-based paraphrasing or NL question templates. Although some of these datasets are of considerable size, their pitfall lies in their pattern-based generation approaches, which do not always generalize well to the vague and linguistically diverse questions asked by humans in real-world contexts. In this paper, we introduce Spider4SPARQL - a new SPARQL benchmark dataset featuring 9,693 previously existing manually generated NL questions and 4,721 unique, novel, and complex SPARQL queries of varying complexity. In addition to the NL/SPARQL pairs, we also provide their corresponding 166 knowledge graphs and ontologies, which cover 138 different domains. Our complex benchmark enables novel ways of evaluating the strengths and weaknesses of modern KGQA systems. We evaluate the benchmark with state-of-the-art KGQA systems as well as LLMs, which achieve only up to 45% execution accuracy, demonstrating that Spider4SPARQL is a challenging benchmark for future research.
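The abstract reports execution accuracy without defining it in this listing. As a rough sketch, assuming execution accuracy means the share of questions whose predicted query returns the same rows as the gold query (compared as order-insensitive multisets), it could be computed as follows. The function names and sample rows are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

Row = Tuple[str, ...]


def same_result(pred_rows: Iterable[Row], gold_rows: Iterable[Row]) -> bool:
    """Compare two result sets as multisets of rows, ignoring row order."""
    return Counter(pred_rows) == Counter(gold_rows)


def execution_accuracy(pairs: Sequence[Tuple[List[Row], List[Row]]]) -> float:
    """Fraction of (predicted, gold) result pairs that match."""
    if not pairs:
        return 0.0
    return sum(same_result(pred, gold) for pred, gold in pairs) / len(pairs)


# Hypothetical results of executing predicted and gold SPARQL queries.
SAMPLE_PAIRS = [
    ([("a",)], [("a",)]),                  # identical
    ([("a",), ("b",)], [("b",), ("a",)]),  # same rows, different order
    ([("a",)], [("c",)]),                  # wrong answer
    ([], [("a",)]),                        # empty prediction
]

print(execution_accuracy(SAMPLE_PAIRS))
```

Whether row order or duplicate rows should matter depends on the query (for example, queries with ORDER BY), so a real evaluation would need to settle that per question.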

dataset, knowledge graph, query, (13 more...)

doi: 10.1109/BigData59044.2023.10386182

2309.16248

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Switzerland > Fribourg > Fribourg (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

HALO: An Ontology for Representing Hallucinations in Generative Models

Nananukul, Navapat, Kejriwal, Mayank

arXiv.org Artificial Intelligence Dec-8-2023

Recent progress in generative AI, including large language models (LLMs) like ChatGPT, has opened up significant opportunities in fields ranging from natural language processing to knowledge discovery and data mining. However, there is also a growing awareness that the models can be prone to problems such as making information up, or "hallucinations", and faulty reasoning on seemingly simple problems. Because of the popularity of models like ChatGPT, both academic scholars and citizen scientists have documented hallucinations of several different types and severity. Despite this body of work, a formal model for describing and representing these hallucinations (with relevant meta-data) at a fine-grained level is still lacking. In this paper, we address this gap by presenting the Hallucination Ontology or HALO, a formal, extensible ontology written in OWL that currently offers support for six different types of hallucinations known to arise in LLMs, along with support for provenance and experimental metadata. We also collect and publish a dataset containing hallucinations that we inductively gathered across multiple independent Web sources, and show that HALO can be successfully used to model this dataset and answer competency questions.

hallucination, halo, ontology, (13 more...)

2312.05209

Country:

North America > United States > New Jersey (0.04)
North America > United States > California > Monterey County > Marina (0.04)
Europe > Greece > Attica > Athens (0.04)

Genre: Research Report (1.00)

Industry: Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

Scalable Knowledge Graph Construction and Inference on Human Genome Variants

Prasanna, Shivika, Rao, Deepthi, Simoes, Eduardo, Rao, Praveen

arXiv.org Artificial Intelligence Dec-7-2023

Real-world knowledge can be represented as a graph consisting of entities and relationships between the entities. The need for efficient and scalable solutions arises when dealing with vast genomic data, like RNA-sequencing. Knowledge graphs offer a powerful approach for various tasks in such large-scale genomic data, such as analysis and inference. In this work, variant-level information extracted from the RNA-sequences of vaccine-naïve COVID-19 patients has been represented as a unified, large knowledge graph. Variant call format (VCF) files containing the variant-level information were annotated to include further information for each variant. The data records in the annotated files were then converted to Resource Description Framework (RDF) triples. Each VCF file obtained had an associated CADD scores file that contained the raw and Phred-scaled scores for each variant. An ontology was defined for the VCF and CADD scores files. Using this ontology and the extracted information, a large, scalable knowledge graph was created. Available graph storage was then leveraged to query and create datasets for further downstream tasks. We also present a case study using the knowledge graph and perform a classification task using graph machine learning. We also draw comparisons between different Graph Neural Networks (GNNs) for the case study.
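To make the record-to-triple conversion concrete, here is a minimal sketch that turns one tab-separated VCF data line into RDF triples serialized as N-Triples. It relies only on the standard VCF column order (CHROM, POS, ID, REF, ALT, ...). The namespace `http://example.org/vcf#`, the property names, and the subject IRI scheme are assumptions made for illustration; they are not the ontology defined by the authors.

```python
from typing import List, Tuple

EX = "http://example.org/vcf#"  # hypothetical namespace, not the paper's ontology
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"

Triple = Tuple[str, str, str]


def vcf_line_to_triples(line: str) -> List[Triple]:
    """Map the first five VCF columns of one data line to RDF triples."""
    chrom, pos, _vid, ref, alt = line.rstrip("\n").split("\t")[:5]
    # One subject IRI per variant, built from its coordinates and alleles.
    subject = f"<{EX}variant/{chrom}_{pos}_{ref}_{alt}>"
    return [
        (subject, f"<{EX}chromosome>", f'"{chrom}"'),
        (subject, f"<{EX}position>", f'"{pos}"^^<{XSD_INT}>'),
        (subject, f"<{EX}ref>", f'"{ref}"'),
        (subject, f"<{EX}alt>", f'"{alt}"'),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical VCF data line (tab-separated).
SAMPLE_LINE = "1\t12345\t.\tA\tG\t50\tPASS\t."
SAMPLE_TRIPLES = vcf_line_to_triples(SAMPLE_LINE)
print(to_ntriples(SAMPLE_TRIPLES))
```

The sketch ignores header lines, the INFO column, and multi-allelic ALT values; annotation fields and CADD scores would add further properties on the same subject.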

knowledge graph, node, variant, (15 more...)

2312.04423

Country:

North America > United States > Missouri > Boone County > Columbia (0.05)
North America > United States > Wisconsin (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.88)
Health & Medicine > Therapeutic Area > Immunology (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

RDF Stream Taxonomy: Systematizing RDF Stream Types in Research and Practice

Sowinski, Piotr, Szmeja, Pawel, Ganzha, Maria, Paprzycki, Marcin

arXiv.org Artificial Intelligence Dec-7-2023

Over the years, RDF streaming was explored in research and practice from many angles, resulting in a wide range of RDF stream definitions. This variety presents a major challenge in discussing and integrating streaming solutions, due to the lack of a common language. This work attempts to address this critical research gap, by systematizing RDF stream types present in the literature in a novel taxonomy. The proposed RDF Stream Taxonomy (RDF-STaX) is embodied in an OWL 2 DL ontology that follows the FAIR principles, making it readily applicable in practice. Extensive documentation and additional resources are provided, to foster the adoption of the ontology. Two realized use cases are presented, demonstrating the usefulness of the resource in discussing research works and annotating streaming datasets. Another result of this contribution is the novel nanopublications dataset, which serves as a collaborative, living state-of-the-art review of RDF streaming. The aim of RDF-STaX is to address a real need of the community for a better way to systematize and describe RDF streams. The resource is designed to help drive innovation in RDF streaming, by fostering scientific discussion, cooperation, and tool interoperability.

dataset, graph, rdf stream, (14 more...)

2311.1454

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > France > Occitanie > Hérault > Montpellier (0.04)
Oceania > Palau (0.04)
(6 more...)

Genre:

Overview (0.68)
Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)

Sem@K: Is my knowledge graph embedding model semantic-aware?

Hubert, Nicolas, Monnin, Pierre, Brun, Armelle, Monticolo, Davy

arXiv.org Artificial Intelligence Dec-7-2023

Using knowledge graph embedding models (KGEMs) is a popular approach for predicting links in knowledge graphs (KGs). Traditionally, the performance of KGEMs for link prediction is assessed using rank-based metrics, which evaluate their ability to give high scores to ground-truth entities. However, the literature claims that the KGEM evaluation procedure would benefit from adding supplementary dimensions to assess. That is why, in this paper, we extend our previously introduced metric Sem@K that measures the capability of models to predict valid entities w.r.t. domain and range constraints. In particular, we consider a broad range of KGs and take their respective characteristics into account to propose different versions of Sem@K. We also perform an extensive study to qualify the abilities of KGEMs as measured by our metric. Our experiments show that Sem@K provides a new perspective on KGEM quality. Its joint analysis with rank-based metrics offers different conclusions on the predictive power of models. Regarding Sem@K, some KGEMs are inherently better than others, but this semantic superiority is not indicative of their performance w.r.t. rank-based metrics. In this work, we generalize conclusions about the relative performance of KGEMs w.r.t. rank-based and semantic-oriented metrics at the level of families of models. The joint analysis of the aforementioned metrics gives more insight into the peculiarities of each model. This work paves the way for a more comprehensive evaluation of KGEM adequacy for specific downstream tasks.
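A small sketch can illustrate the idea of scoring predictions against domain and range constraints. It assumes the simplest reading of Sem@K: the fraction of the top-K ranked candidate entities whose classes satisfy the expected class set of the relation. The abstract mentions several versions of the metric, which this does not attempt to reproduce; the entity names, class labels, and function name are hypothetical.

```python
from typing import Dict, List, Set


def sem_at_k(
    ranked: List[str],
    entity_types: Dict[str, Set[str]],
    expected: Set[str],
    k: int,
) -> float:
    """Share of the top-k candidates with at least one expected class."""
    top = ranked[:k]
    if not top:
        return 0.0
    valid = sum(1 for e in top if entity_types.get(e, set()) & expected)
    # Divides by the number of candidates actually available (at most k).
    return valid / len(top)


# Hypothetical tail predictions for a query like (Paris, capitalOf, ?),
# where the range of capitalOf is assumed to be the class Country.
ENTITY_TYPES = {
    "France": {"Country"},
    "Berlin": {"City"},
    "Germany": {"Country"},
    "Einstein": {"Person"},
}
RANKED_TAILS = ["France", "Berlin", "Germany", "Einstein"]
RANGE_OF_CAPITAL_OF = {"Country"}

for k in (1, 2, 3):
    print(k, sem_at_k(RANKED_TAILS, ENTITY_TYPES, RANGE_OF_CAPITAL_OF, k))
```

Note that "Germany" counts as semantically valid here even though it is not the correct answer, which is why such a score can diverge from rank-based metrics.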

knowledge graph, relation, sem, (12 more...)

2301.05601

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Africa > Mozambique > Maputo City > Maputo (0.04)
North America > United States > New York > New York County > New York City (0.04)
(39 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.69)

Towards Ordinal Data Science

Stumme, Gerd, Dürrschnabel, Dominik, Hanika, Tom

arXiv.org Artificial Intelligence Dec-6-2023

Order is one of the main instruments to measure the relationship between objects in (empirical) data. However, compared to methods that use numerical properties of objects, the number of ordinal methods developed is rather small. One reason for this is the limited availability of computational resources in the last century that would have been required for ordinal computations. Another reason -- particularly important for this line of research -- is that order-based methods are often seen as too mathematically rigorous for applying them to real-world data. In this paper, we will therefore discuss different means for measuring and 'calculating' with ordinal structures -- a specific class of directed graphs -- and show how to infer knowledge from them. Our aim is to establish Ordinal Data Science as a fundamentally new research agenda. Besides cross-fertilization with other cornerstone machine learning and knowledge representation methods, a broad range of disciplines will benefit from this endeavor, including psychology, sociology, economics, web science, knowledge engineering, and scientometrics.

lattice, relation, stumme, (16 more...)

doi: 10.4230/TGDK.1.1.6

2307.09477

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Hungary > Budapest > Budapest (0.04)
Europe > Germany > Saxony > Leipzig (0.04)
(35 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Information Technology > Services (1.00)
(5 more...)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(5 more...)

Large Knowledge Model: Perspectives and Challenges

Chen, Huajun

arXiv.org Artificial Intelligence Dec-5-2023

Humankind's understanding of the world is fundamentally linked to our perception and cognition, with human languages serving as one of the major carriers of world knowledge. In this vein, Large Language Models (LLMs) like ChatGPT epitomize the pre-training of extensive, sequence-based world knowledge into neural networks, facilitating the processing and manipulation of this knowledge in a parametric space. This article explores large models through the lens of "knowledge". We initially investigate the role of symbolic knowledge such as Knowledge Graphs (KGs) in enhancing LLMs, covering aspects like knowledge-augmented language models, structure-inducing pre-training, knowledgeable prompts, structured CoT, knowledge editing, semantic tools for LLMs, and knowledgeable AI agents. Subsequently, we examine how LLMs can amplify traditional symbolic knowledge bases, encompassing aspects like using LLMs as KG builders and controllers, structured knowledge pretraining, LLM-enhanced symbolic reasoning, and the amalgamation of perception with cognition. Considering the intricate nature of human knowledge, we advocate for the creation of Large Knowledge Models (LKM), specifically engineered to manage a diversified spectrum of knowledge structures. This ambitious undertaking could entail several key challenges, such as disentangling knowledge representation from language models, restructuring pre-training with structured knowledge, and building large commonsense models, among others. We finally propose a five-"A" principle to distinguish the concept of LKM.

knowledge, language model, reasoning, (16 more...)

2312.02706

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
(2 more...)

Plug-and-Play Knowledge Injection for Pre-trained Language Models

Zhang, Zhengyan, Zeng, Zhiyuan, Lin, Yankai, Wang, Huadong, Ye, Deming, Xiao, Chaojun, Han, Xu, Liu, Zhiyuan, Li, Peng, Sun, Maosong, Zhou, Jie

arXiv.org Artificial Intelligence Dec-4-2023

Injecting external knowledge can improve the performance of pre-trained language models (PLMs) on various downstream NLP tasks. However, massive retraining is required to deploy new knowledge injection methods or knowledge bases for downstream tasks. In this work, we are the first to study how to improve the flexibility and efficiency of knowledge injection by reusing existing downstream models. To this end, we explore a new paradigm, plug-and-play knowledge injection, where knowledge bases are injected into frozen existing downstream models by a knowledge plugin. Correspondingly, we propose a plug-and-play injection method, map-tuning, which trains a mapping of knowledge embeddings to enrich model inputs with mapped embeddings while keeping model parameters frozen. Experimental results on three knowledge-driven NLP tasks show that existing injection methods are not suitable for the new paradigm, while map-tuning effectively improves the performance of downstream models. Moreover, we show that a frozen downstream model can be well adapted to different domains with different mapping networks of domain knowledge. Our code and models are available at https://github.com/THUNLP/Knowledge-Plugin.

knowledge, mapping network, (16 more...)

2305.17691

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.46)

Matching Weak Informative Ontologies

Wang, Peng

arXiv.org Artificial Intelligence Nov-30-2023

Most existing ontology matching methods utilize the literal information to discover alignments. However, some literal information in ontologies may be opaque and some ontologies may not have sufficient literal information. In this paper, these ontologies are called weak informative ontologies (WIOs), and it is challenging for existing methods to match WIOs. On the one hand, string-based and linguistic-based matching methods cannot work well for WIOs. On the other hand, some matching methods use external resources to improve their performance, but collecting and processing external resources is still time-consuming. To address this issue, this paper proposes a practical method for matching WIOs by employing the ontology structure information to discover alignments. First, the semantic subgraphs are extracted from the ontology graph to capture the precise meanings of ontology elements. Then, a new similarity propagation model is designed for matching WIOs. Meanwhile, in order to avoid meaningless propagation, the similarity propagation is constrained by semantic subgraphs and other conditions. Consequently, the similarity propagation model ensures a balance between efficiency and quality during matching. Finally, the similarity propagation model uses a few credible alignments as seeds to find more alignments, and some useful strategies are adopted to improve the performance. This matching method for WIOs has been implemented in the ontology matching system Lily. Experimental results on public OAEI benchmark datasets demonstrate that Lily significantly outperforms most of the state-of-the-art works in both WIO matching tasks and general ontology matching tasks. In particular, Lily increases the recall by a large margin, while it still obtains high precision of matching results.

graph, ontology, semantic subgraph, (13 more...)

2312.00332

Country:

Europe > Austria > Vienna (0.14)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Asia > Middle East > Jordan (0.04)
(19 more...)

Genre:

Research Report (0.81)
Overview (0.67)

Industry: Energy > Power Industry (0.67)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)

Agent-OM: Leveraging Large Language Models for Ontology Matching

Qiang, Zhangcheng, Wang, Weiqing, Taylor, Kerry

arXiv.org Artificial Intelligence Nov-30-2023

Ontology matching (OM) enables semantic interoperability between different ontologies and resolves their conceptual heterogeneity by aligning related entities. OM systems currently have two prevailing design paradigms: conventional knowledge-based expert systems and newer machine learning-based predictive systems. While large language models (LLMs) and LLM-based agents have become revolutionary in data engineering and have been applied creatively in various domains, their potential for OM remains underexplored. This study introduces a novel agent-powered LLM-based design paradigm for OM systems. With thoughtful consideration of several specific challenges to leverage LLMs for OM, we propose a generic framework, namely Agent-OM, consisting of two Siamese agents for retrieval and matching, with a set of simple prompt-based OM tools. Our framework is implemented in a proof-of-concept system. Evaluations of three Ontology Alignment Evaluation Initiative (OAEI) tracks over state-of-the-art OM systems show that our system can achieve very close results to the best long-standing performance on simple OM tasks and significantly improve the performance on complex and few-shot OM tasks.

information, ontology, subjectarea, (16 more...)

2312.00326

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > New York > New York County > New York City (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry:

Information Technology (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
