Semantic Networks
SMART: Relation-Aware Learning of Geometric Representations for Knowledge Graphs
Amouzouvi, Kossi, Song, Bowen, Coletta, Andrea, Bellomarini, Luigi, Lehmann, Jens, Vahdati, Sahar
Knowledge graph representation learning approaches provide a mapping between symbolic knowledge in the form of triples in a knowledge graph (KG) and their feature vectors. Knowledge graph embedding (KGE) models often represent relations in a KG as geometric transformations. Most state-of-the-art (SOTA) KGE models are derived from elementary geometric transformations (EGTs), such as translation, scaling, rotation, and reflection, or their combinations. These geometric transformations enable the models to effectively preserve specific structural and relational patterns of the KG. However, the current use of EGTs by KGEs remains insufficient without considering relation-specific transformations. Although recent models attempted to address this problem by ensembling SOTA baseline models in different ways, only a single or composite version of geometric transformations are used by such baselines to represent all the relations. In this paper, we propose a framework that evaluates how well each relation fits with different geometric transformations. Based on this ranking, the model can: (1) assign the best-matching transformation to each relation, or (2) use majority voting to choose one transformation type to apply across all relations. That is, the model learns a single relation-specific EGT in low dimensional vector space through an attention mechanism. Furthermore, we use the correlation between relations and EGTs, which are learned in a low dimension, for relation embeddings in a high dimensional vector space. The effectiveness of our models is demonstrated through comprehensive evaluations on three benchmark KGs as well as a real-world financial KG, witnessing a performance comparable to leading models
KGRAG-Ex: Explainable Retrieval-Augmented Generation with Knowledge Graph-based Perturbations
Balanos, Georgios, Chasanis, Evangelos, Skianis, Konstantinos, Pitoura, Evaggelia
Retrieval-Augmented Generation (RAG) enhances language models by grounding responses in external information, yet explainability remains a critical challenge, particularly when retrieval relies on unstructured text. Knowledge graphs (KGs) offer a solution by introducing structured, semantically rich representations of entities and their relationships, enabling transparent retrieval paths and interpretable reasoning. In this work, we present KGRAG-Ex, a RAG system that improves both factual grounding and explainability by leveraging a domain-specific KG constructed via prompt-based information extraction. Given a user query, KGRAG-Ex identifies relevant entities and semantic paths in the graph, which are then transformed into pseudo-paragraphs: natural language representations of graph substructures that guide corpus retrieval. To improve interpretability and support reasoning transparency, we incorporate perturbation-based explanation methods that assess the influence of specific KG-derived components on the generated answers. We conduct a series of experiments to analyze the sensitivity of the system to different perturbation methods, the relationship between graph component importance and their structural positions, the influence of semantic node types, and how graph metrics correspond to the influence of components within the explanations process.
Context Pooling: Query-specific Graph Pooling for Generic Inductive Link Prediction in Knowledge Graphs
Su, Zhixiang, Wang, Di, Miao, Chunyan
Recent investigations on the effectiveness of Graph Neural Network (GNN)-based models for link prediction in Knowledge Graphs (KGs) show that vanilla aggregation does not significantly impact the model performance. In this paper, we introduce a novel method, named Context Pooling, to enhance GNN-based models' efficacy for link predictions in KGs. To our best of knowledge, Context Pooling is the first methodology that applies graph pooling in KGs. Additionally, Context Pooling is first-of-its-kind to enable the generation of query-specific graphs for inductive settings, where testing entities are unseen during training. Specifically, we devise two metrics, namely neighborhood precision and neighborhood recall, to assess the neighbors' logical relevance regarding the given queries, thereby enabling the subsequent comprehensive identification of only the logically relevant neighbors for link prediction. Our method is generic and assessed by being applied to two state-of-the-art (SOTA) models on three public transductive and inductive datasets, achieving SOTA performance in 42 out of 48 settings.
NDAI-NeuroMAP: A Neuroscience-Specific Embedding Model for Domain-Specific Retrieval
Patel, Devendra, Jain, Aaditya, Verma, Jayant, Rajput, Divyansh, Mahala, Sunil, Khapare, Ketki Suresh, Kalla, Jayateja
The exponential growth in neuroscience research output and clinical data necessitates the development of specialized natural language processing models tailored to this domain. Contemporary embedding models, while demonstrating superior performance on general-purpose benchmarks, exhibit suboptimal efficacy when applied to neuroscience-specific tasks due to their broad training objectives and limited exposure to domain-specific terminologies and conceptual relationships. This limitation significantly constrains the development of advanced applications including patient-centric retrieval-augmented generation (RAG) systems and comprehensive electronic health record (EHR) mining for neurological healthcare applications. To address this critical gap, we present NDAI-NeuroMAP, the first neuroscience-domain-specific dense vector embedding model engineered for high-precision information retrieval tasks. Our methodology encompasses the curation of an extensive domain-specific training corpus comprising 500,000 carefully constructed triplets (query-positive-negative configurations), augmented with 250,000 neuroscience-specific definitional entries and 250,000 structured knowledge-graph triplets derived from authoritative neurological ontologies. We employ a sophisticated fine-tuning approach utilizing the FremyCompany/BioLORD-2023 foundation model, implementing a multi-objective optimization framework combining contrastive learning with triplet-based metric learning paradigms. Comprehensive evaluation on a held-out test dataset comprising approximately 24,000 neuroscience-specific queries demonstrates substantial performance improvements over state-of-the-art general-purpose and biomedical embedding models. These empirical findings underscore the critical importance of domain-specific embedding architectures for neuroscience-oriented RAG systems and related clinical natural language processing applications. The landscape of natural language processing (NLP) has evolved profoundly over the past decade, driven by advances in neural embedding architectures. These models, which transform text into dense, high-dimensional vectors, now support diverse tasks spanning cross-lingual translation to large-scale information retrieval. Early methods, such as the seminal Word2V ec [1] and GloV e [2], introduced static word embeddings that successfully captured semantic relationships through distributional statistics, but failed to account for context, producing identical vectors for terms like "bank" regardless of meaning. Contextualized embedding architectures subsequently overcame these limitations.
T3DM: Test-Time Training-Guided Distribution Shift Modelling for Temporal Knowledge Graph Reasoning
Si, Yuehang, Zeng, Zefan, Huang, Jincai, Cheng, Qing
Temporal Knowledge Graph (TKG) is an efficient method for describing the dynamic development of facts along a timeline. Most research on TKG reasoning (TKGR) focuses on modelling the repetition of global facts and designing patterns of local historical facts. However, they face two significant challenges: inadequate modeling of the event distribution shift between training and test samples, and reliance on random entity substitution for generating negative samples, which often results in low-quality sampling. To this end, we propose a novel distributional feature modeling approach for training TKGR models, Test-Time Training-guided Distribution shift Modelling (T3DM), to adjust the model based on distribution shift and ensure the global consistency of model reasoning. In addition, we design a negative-sampling strategy to generate higher-quality negative quadruples based on adversarial training. Extensive experiments show that T3DM provides better and more robust results than the state-of-the-art baselines in most cases.
Two-dimensional Taxonomy for N-ary Knowledge Representation Learning Methods
Lu, Xiaohua, Tupikina, Liubov, Alam, Mehwish
Real-world knowledge can take various forms, including structured, semi-structured, and unstructured data. Among these, knowledge graphs are a form of structured human knowledge that integrate heterogeneous data sources into structured representations but typically reduce complex n-ary relations to simple triples, thereby losing higher-order relational details. In contrast, hypergraphs naturally represent n-ary relations with hyperedges, which directly connect multiple entities together. Yet hypergraph representation learning often overlooks entity roles in hyperedges, limiting the finegrained semantic modelling. To address these issues, knowledge hypergraphs and hyper-relational knowledge graphs combine the advantages of knowledge graphs and hypergraphs to better capture the complex structures and role-specific semantics of real world knowledge. This survey provides a comprehensive review of methods handling n-ary relational data, covering both knowledge hypergraphs and hyper-relational knowledge graphs literatures. We propose a two-dimensional taxonomy: the first dimension categorises models based on their methodology, i.e., translation-based models, tensor factorisation-based models, deep neural network-based models, logic rules-based models, and hyperedge expansion-based models. The second dimension classifies models according to their awareness of entity roles and positions in n-ary relations, dividing them into aware-less, position-aware, and role-aware approaches. Finally, we discuss existing datasets, training settings and strategies, and outline open challenges to inspire future research.
Hyper-modal Imputation Diffusion Embedding with Dual-Distillation for Federated Multimodal Knowledge Graph Completion
Zhang, Ying, Zhao, Yu, Sui, Xuhui, Zhou, Baohang, Cai, Xiangrui, Shen, Li, Yuan, Xiaojie, Tao, Dacheng
--With the increasing multimodal knowledge privatization requirements, multimodal knowledge graphs in different institutes are usually decentralized, lacking of effective collaboration system with both stronger reasoning ability and transmission safety guarantees. In this paper, we propose the Federated Multimodal Knowledge Graph Completion (FedMKGC) task, aiming at training over federated MKGs for better predicting the missing links in clients without sharing sensitive knowledge. We propose a framework named MMFeD3-HidE for addressing multimodal uncertain unavailability and multimodal client heterogeneity challenges of FedMKGC. We propose a FedMKGC benchmark for a comprehensive evaluation, consisting of a general FedMKGC backbone named MMFedE, datasets with heterogeneous multimodal information, and three groups of constructed baselines. Experiments conducted on our benchmark validate the effectiveness, semantic consistency, and convergence robustness of MMFeD3-HidE. Multimodal knowledge graphs (MKGs) [1], [2] organize graph structures composed of relational triples (head entity, relation, tail entity) and their visual and textual attributes as Figure 1, which have been widely in multimodal knowledge-intensive tasks [3]-[6]. Due to incomplete construction and new knowledge emergence, Multimodal Knowledge Graph Completion (MKGC) task [7], [8] has been widely-explored [9]-[12] to reason the missing links in MKGs with multimodal information. For example, in Figure 1, given a query (Kobe Bryant, T eam member,?), MKGC aims to predict the tail entity L.A. Lakers with graph structures, entity images, and descriptions. In real world, the MKGs are usually decentralized in different institutes due to commercial interests or data regulations, such as open-sourced MKGs DBpedia [13], Wiki-Ying Zhang, Y u Zhao, Xuhui Sui, Baohang Zhou, Xiangrui Cai, Xiaojie Y uan are with the College of Computer Science, VCIP, TMCC, TBI Center, DISSec, Nankai University, Tianjin, China (e-mail: yingzhang@nankai.edu.cn, Li Shen is with the Shenzhen Campus of Sun Y at-sen University, Shenzhen, China (e-mail: mathshenli@gmail.com).
medicX-KG: A Knowledge Graph for Pharmacists' Drug Information Needs
Farrugia, Lizzy, Azzopardi, Lilian M., Debattista, Jeremy, Abela, Charlie
The role of pharmacists is evolving from medicine dispensing to delivering comprehensive pharmaceutical services within multidisciplinary healthcare teams. Central to this shift is access to accurate, up-to-date medicinal product information supported by robust data integration. Leveraging artificial intelligence and semantic technologies, Knowledge Graphs (KGs) uncover hidden relationships and enable data-driven decision-making. This paper presents medicX-KG, a pharmacist-oriented knowledge graph supporting clinical and regulatory decisions. It forms the semantic layer of the broader medicX platform, powering predictive and explainable pharmacy services. medicX-KG integrates data from three sources, including, the British National Formulary (BNF), DrugBank, and the Malta Medicines Authority (MMA) that addresses Malta's regulatory landscape and combines European Medicines Agency alignment with partial UK supply dependence. The KG tackles the absence of a unified national drug repository, reducing pharmacists' reliance on fragmented sources. Its design was informed by interviews with practicing pharmacists to ensure real-world applicability. We detail the KG's construction, including data extraction, ontology design, and semantic mapping. Evaluation demonstrates that medicX-KG effectively supports queries about drug availability, interactions, adverse reactions, and therapeutic classes. Limitations, including missing detailed dosage encoding and real-time updates, are discussed alongside directions for future enhancements.
Grounding Language Models with Semantic Digital Twins for Robotic Planning
Naeem, Mehreen, Melnik, Andrew, Beetz, Michael
We introduce a novel framework that integrates Semantic Digital Twins (SDTs) with Large Language Models (LLMs) to enable adaptive and goal-driven robotic task execution in dynamic environments. The system decomposes natural language instructions into structured action triplets, which are grounded in contextual environmental data provided by the SDT. This semantic grounding allows the robot to interpret object affordances and interaction rules, enabling action planning and real-time adaptability. In case of execution failures, the LLM utilizes error feedback and SDT insights to generate recovery strategies and iteratively revise the action plan. We evaluate our approach using tasks from the ALFRED benchmark, demonstrating robust performance across various household scenarios. The proposed framework effectively combines high-level reasoning with semantic environment understanding, achieving reliable task completion in the face of uncertainty and failure.
Analyzing the Influence of Knowledge Graph Information on Relation Extraction
Möller, Cedric, Usbeck, Ricardo
We examine the impact of incorporating knowledge graph information on the performance of relation extraction models across a range of datasets. Our hypothesis is that the positions of entities within a knowledge graph provide important insights for relation extraction tasks. We conduct experiments on multiple datasets, each varying in the number of relations, training examples, and underlying knowledge graphs. Our results demonstrate that integrating knowledge graph information significantly enhances performance, especially when dealing with an imbalance in the number of training examples for each relation. We evaluate the contribution of knowledge graph-based features by combining established relation extraction methods with graph-aware Neural Bellman-Ford networks. These features are tested in both supervised and zero-shot settings, demonstrating consistent performance improvements across various datasets.