Ontologies
Homa at SemEval-2025 Task 5: Aligning Librarian Records with OntoAligner for Subject Tagging
Tekanlou, Hadi Bayrami Asl, Razmara, Jafar, Sanaei, Mahsa, Rahgouy, Mostafa, Giglou, Hamed Babaei
This paper presents our system, Homa, for SemEval-2025 Task 5: Subject Tagging, which focuses on automatically assigning subject labels to technical records from TIBKAT using the Gemeinsame Normdatei (GND) taxonomy. We leverage OntoAligner, a modular ontology alignment toolkit, to address this task by integrating retrieval-augmented generation (RAG) techniques. Our approach formulates the subject tagging problem as an alignment task, where records are matched to GND categories based on semantic similarity. We evaluate OntoAligner's adaptability for subject indexing and analyze its effectiveness in handling multilingual records. Experimental results demonstrate the strengths and limitations of this method, highlighting the potential of alignment techniques for improving subject tagging in digital libraries.
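The alignment framing in the abstract above, matching a record to taxonomy categories by semantic similarity, can be illustrated with a minimal sketch. This is not the Homa system or the OntoAligner API: it assumes a plain bag-of-words cosine similarity in place of the retrieval and LLM components, and the names `tokenize`, `cosine`, `rank_subjects`, the subject ids, and the toy record are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_subjects(record, subjects, top_k=2):
    """Return up to top_k subject ids with nonzero similarity, best first."""
    record_vec = Counter(tokenize(record))
    scored = [
        (cosine(record_vec, Counter(tokenize(label))), subject_id)
        for subject_id, label in subjects.items()
    ]
    # Sort by descending score, breaking ties by subject id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [subject_id for score, subject_id in scored[:top_k] if score > 0]


# Hypothetical toy data; these are not real GND identifiers or TIBKAT records.
subjects = {
    "S1": "machine learning neural networks",
    "S2": "organic chemistry synthesis",
    "S3": "library catalog subject indexing",
}
record = "Neural networks for automatic subject indexing in a library catalog"
print(rank_subjects(record, subjects))  # ['S3', 'S1']
```

A real system would presumably replace the term-count vectors with dense multilingual embeddings, since lexical overlap alone cannot match a German record to an English label.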
How Group Lives Go Well
Beverley, John, Hurley, Regina
This paper explores the ontological space of group well-being, proposing a framework for representing collective welfare, group functions, and long-term contributions within an ontology engineering context. Traditional well-being theories focus on individual states, often relying on hedonistic, desire-satisfaction, or objective list models. Such approaches struggle to account for cases where individual sacrifices contribute to broader social progress, a critical challenge in modeling group flourishing. To address this, the paper refines and extends the Counterfactual Account (CT) of well-being, which evaluates the goodness of an event by comparing an individual's actual well-being with that of a hypothetical counterpart in a nearby possible world. While useful, this framework is insufficient for group-level ontologies, where well-being depends on functional persistence, institutional roles, and historical impact rather than immediate individual outcomes. Drawing on Basic Formal Ontology (BFO), the paper introduces a model in which group flourishing is evaluated in terms of group functional, where members bear roles and exhibit persistence conditions akin to biological systems or designed artifacts. This approach enables semantic interoperability for modeling longitudinal social contributions, allowing for structured reasoning about group welfare, social institutions, and group flourishing over time.
GLaMoR: Consistency Checking of OWL Ontologies using Graph Language Models
Mücke, Justin, Scherp, Ansgar
Semantic reasoning aims to infer new knowledge from existing knowledge, with OWL ontologies serving as a standardized framework for organizing information. A key challenge in semantic reasoning is verifying ontology consistency. However, state-of-the-art reasoners are computationally expensive, and their efficiency decreases as ontology sizes grow. While classical machine learning models have been explored for consistency checking, they struggle to capture complex relationships within ontologies. Large language models (LLMs) have shown promising results for simple reasoning tasks but perform poorly on structured reasoning. The recently introduced Graph Language Model (GLM) offers a way to simultaneously process graph-structured data and text. This paper proposes GLaMoR (Graph Language Model for Reasoning), a reasoning pipeline that transforms OWL ontologies into graph-structured data and adapts the GLM architecture for consistency checking. We evaluate GLaMoR on ontologies from the NCBO BioPortal repository, converting them into triples suitable for model input. Our results show that the GLM outperforms all baseline models, achieving 95% accuracy while being 20 times faster than classical reasoners. With the increasing complexity of knowledge representation and reasoning systems, ontologies play a vital role in structuring domain knowledge across various fields, e.g., biomedical expert knowledge. OWL provides a stable foundation for diverse tasks based on ontologies. OWL 2 [1] is based on the SROIQ [2] description logic, which supports complex reasoning while maintaining logical consistency. To derive additional knowledge from these ontologies, semantic reasoners are employed to infer new facts through logical entailment. These reasoners are critical in supporting key tasks such as classification, query answering, and consistency checking by leveraging formal logic systems for precise and reliable inference.
A prominent example is HermiT [3], an OWL 2-compliant reasoner that uses hyper-tableau calculus to perform reasoning tasks efficiently.
Graph2Nav: 3D Object-Relation Graph Generation to Robot Navigation
Shan, Tixiao, Rajvanshi, Abhinav, Mithun, Niluthpol, Chiu, Han-Pang
We propose Graph2Nav, a real-time 3D object-relation graph generation framework, for autonomous navigation in the real world. Our framework fully generates and exploits both 3D objects and a rich set of semantic relationships among objects in a 3D layered scene graph, which is applicable to both indoor and outdoor scenes. It learns to generate 3D semantic relations among objects by leveraging and advancing state-of-the-art 2D panoptic scene graph works into the 3D world via 3D semantic mapping techniques. This approach avoids previous training data constraints in learning 3D scene graphs directly from 3D data. We conduct experiments to validate the accuracy in locating 3D objects and labeling object-relations in our 3D scene graphs. We also evaluate the impact of Graph2Nav via integration with SayNav, a state-of-the-art planner based on large language models, on an unmanned ground robot for object search tasks in real environments. Our results demonstrate that modeling object relations in our scene graphs improves search efficiency in these navigation tasks. The main advantage of a 3D scene graph over other object-based 3D scene representations is its capability to also represent semantic relationships among objects. These relationships are useful to many downstream applications, such as scene manipulation [13], [14] and task planning [12]. Leveraging 3D scene graphs for robot navigation has also emerged as a promising research field with impressive performance [15]-[17].
Context-Awareness and Interpretability of Rare Occurrences for Discovery and Formalization of Critical Failure Modes
Polavaram, Sridevi, Zhou, Xin, Ravi, Meenu, Zarei, Mohammad, Srivastava, Anmol
Vision systems are increasingly deployed in critical domains such as surveillance, law enforcement, and transportation. However, their vulnerabilities to rare or unforeseen scenarios pose significant safety risks. To address these challenges, we introduce Context-Awareness and Interpretability of Rare Occurrences (CAIRO), an ontology-based human-assistive discovery framework for the detection and formalization of failure cases (or CP, Critical Phenomena). CAIRO by design incentivizes human-in-the-loop testing and evaluation of criticality that arises from misdetections, adversarial attacks, and hallucinations in AI black-box models. Our robust analysis of object detection model failures in automated driving systems (ADS) showcases scalable and interpretable ways of formalizing the observed gaps between camera perception and real-world contexts, resulting in test cases stored as explicit knowledge graphs (in OWL/XML format) amenable to sharing, downstream analysis, logical reasoning, and accountability. INTRODUCTION: Formal verification techniques are a norm in chip design, but they remain elusive in computer vision (CV) applications. The reason is that CV applications are deemed open-ended, often trained on millions of data points with billions of parameters to learn a few hundred labels. Fine-tuning practices are commonly used to tailor them to specific needs, but with no standard testing procedures in place to guide their application and ensure fail-safe behaviors, critical systems like Autonomous Vehicles (AV) are bound to fail [1].
A Phenomenological Approach to Analyzing User Queries in IT Systems Using Heidegger's Fundamental Ontology
This paper presents a novel research analytical IT system grounded in Martin Heidegger's Fundamental Ontology, distinguishing between beings (das Seiende) and Being (das Sein). The system employs two modally distinct, descriptively complete languages: a categorical language of beings for processing user inputs and an existential language of Being for internal analysis. These languages are bridged via a phenomenological reduction module, enabling the system to analyze user queries (including questions, answers, and dialogues among IT specialists), identify recursive and self-referential structures, and provide actionable insights in categorical terms. Unlike contemporary systems limited to categorical analysis, this approach leverages Heidegger's phenomenological existential analysis to uncover deeper ontological patterns in query processing, aiding in resolving logical traps in complex interactions, such as metaphor usage in IT contexts. The path to full realization involves formalizing the language of Being by a research team based on Heidegger's Fundamental Ontology; given the existing completeness of the language of beings, this reduces the system's computability to completeness, paving the way for a universal query analysis tool. The paper presents the system's architecture, operational principles, technical implementation, use cases--including a case based on real IT specialist dialogues--comparative evaluation with existing tools, and its advantages and limitations.
Inversion of biological strategies in engineering technology: in case underwater soft robot
Chen, Siqing, Xu, He, Zhang, Xueyu, Ma, Zhen
This paper proposes a biomimetic design framework based on biological strategy inversion, aiming to systematically map solutions evolved in nature to the engineering field. Using underwater soft robot design as a case study, the effectiveness of the framework in optimizing drive mechanisms, power distribution, and motion pattern design is verified. This research provides scalable methodological support for interdisciplinary biomimetic innovation. Keywords: Bionic design; Biological strategy inversion; Knowledge framework; Soft robot. Corresponding author: He Xu (railway_dragon@sohu.com). 1. Introduction: The core process of biomimetic inspired design can be divided into four progressive stages: problem definition, biological prototype screening, principle extraction, and engineering technology transformation [1]. This paradigm is essentially a cross-domain knowledge reconstruction process, utilizing existing biological characteristics, behaviors, and functions to correspond to features, behaviors, and similar functions in engineering, with the key being the efficiency of knowledge mapping between biological systems and engineering systems [2]. The cognitive bottleneck in current research areas lies in the fact that the high complexity of biological systems often makes it difficult to pinpoint key strategic information, while the existing knowledge framework of engineering systems struggles to effectively integrate with biological strategy knowledge. Researchers with a biological background can explain the operational rules of natural systems well but lack knowledge reserves for engineering problems [4]. Engineers working in this field commonly encounter systemic barriers in identifying biological strategies, constrained by the professional barriers of the biological terminology system and the technical limitations of interdisciplinary knowledge expression [4][3].
Therefore, constructing an intelligent matching mechanism between biological characteristics and engineering parameters, and improving the technical processes for screening biological prototypes and converting engineering technologies, are important research directions for enhancing the effectiveness of biomimetic design.
Language and Knowledge Representation: A Stratified Approach
It can have serious implications in critical application scenarios like that of Knowledge Graph-based multilingual data integration. In view of the above, the thesis argues that the current understanding of the problem of semantic heterogeneity as the 'existence of variance', while being crucially necessary, is not sufficient and is under-characterized. There can be no variance without a prior notion of a unifying reference taken as the basis for computing the variance itself. To that end, the thesis proposes the problem of representation heterogeneity to emphasize the fact that heterogeneity is an intrinsic property of any representation, wherein different observers encode different representations of the same target reality in a stratified manner using different concepts, language and knowledge (as well as data). The thesis then advances a top-down solution approach to the above stratified problem of representation heterogeneity in terms of several solution components, namely: (i) a representation formalism stratified into concept level, language level, knowledge level and data level to accommodate representation heterogeneity, (ii) a top-down language representation using Universal Knowledge Core (UKC), UKC namespaces and domain languages to tackle the conceptual and language level heterogeneity, (iii) a top-down knowledge representation using the notions of language teleontology and knowledge teleontology to tackle the knowledge level heterogeneity, (iv) the usage and further development of the existing LiveKnowledge catalog for enforcing iterative reuse and sharing of language and knowledge representations, and (v) the kTelos methodology integrating the solution components above to iteratively generate the language and knowledge representations absolving representation heterogeneity.
The thesis also includes proofs of concept of the language and knowledge representations developed for two international research projects: DataScientia (data catalogs) and JIDEP (materials modelling). Finally, the thesis concludes with future lines of research.
From Conceptual Data Models to Multimodal Representation
1) Introduction and Conceptual Framework: This document explores the concept of information design by dividing it into two major practices: defining the meaning of a corpus of textual data and its visual or multimodal representation. It draws on expertise in enriching textual corpora, particularly audiovisual ones, and transforming them into multiple narrative formats. The text highlights a crucial distinction between the semantic content of a domain and the modalities of its graphic expression, illustrating this approach with concepts rooted in structural semiotics and linguistics traditions. 2) Modeling and Conceptual Design: The article emphasizes the importance of semantic modeling, often achieved through conceptual networks or graphs. These tools enable the structuring of knowledge within a domain by accounting for relationships between concepts, contexts of use, and specific objectives. Stockinger also highlights the constraints and challenges involved in creating dynamic and adaptable models, integrating elements such as thesauri or interoperable ontologies to facilitate the analysis and publication of complex corpora. 3) Applications and Multimodal Visualization: The text concludes by examining the practical application of these models in work environments like OKAPI, developed to analyze, publish, and reuse audiovisual data. It also discusses innovative approaches such as visual storytelling and document reengineering, which involve transforming existing content into new resources tailored to various contexts. These methods emphasize interoperability, flexibility, and the intelligence of communication systems, paving the way for richer and more collaborative use of digital data. The content of this document was presented during the "Semiotics of Information Design" Day organized by Anne Beyaert-Geslin of the University of Bordeaux Montaigne (MICA laboratory) on June 21, 2018, in Bordeaux.
Reduction of Supervision for Biomedical Knowledge Discovery
Theodoropoulos, Christos, Coman, Andrei Catalin, Henderson, James, Moens, Marie-Francine
Knowledge discovery is hindered by the increasing volume of publications and the scarcity of extensive annotated data. To tackle the challenge of information overload, it is essential to employ automated methods for knowledge extraction and processing. Finding the right balance between the level of supervision and the effectiveness of models poses a significant challenge. While supervised techniques generally result in better performance, they have the major drawback of demanding labeled data. This requirement is labor-intensive and time-consuming and hinders scalability when exploring new domains. In this context, our study addresses the challenge of identifying semantic relationships between biomedical entities (e.g., diseases, proteins) in unstructured text while minimizing dependency on supervision. We introduce a suite of unsupervised algorithms based on dependency trees and attention mechanisms and employ a range of pointwise binary classification methods. Transitioning from weakly supervised to fully unsupervised settings, we assess the methods' ability to learn from data with noisy labels. The evaluation on biomedical benchmark datasets explores the effectiveness of the methods. Our approach tackles a central issue in knowledge discovery: balancing performance with minimal supervision. By gradually decreasing supervision, we assess the robustness of pointwise binary classification techniques in handling noisy labels, revealing their capability to shift from weakly supervised to entirely unsupervised scenarios. Comprehensive benchmarking offers insights into the effectiveness of these techniques, suggesting an encouraging direction toward adaptable knowledge discovery systems, representing progress in creating data-efficient methodologies for extracting useful insights when annotated data is limited.