DrKGC: Dynamic Subgraph Retrieval-Augmented LLMs for Knowledge Graph Completion across General and Biomedical Domains
Xiao, Yongkang, Zhang, Sinian, Dai, Yi, Zhou, Huixue, Hou, Jue, Ding, Jie, Zhang, Rui
Knowledge graph completion (KGC) aims to predict missing triples in knowledge graphs (KGs) by leveraging existing triples and textual information. Recently, generative large language models (LLMs) have been increasingly employed for graph tasks. However, current approaches typically encode graph context in textual form, which fails to fully exploit the potential of LLMs for perceiving and reasoning about graph structures. To address this limitation, we propose DrKGC (Dynamic Subgraph Retrieval-Augmented LLMs for Knowledge Graph Completion). DrKGC employs a flexible, lightweight model training strategy to learn structural embeddings and logical rules within the KG. It then leverages a novel bottom-up graph retrieval method to extract a subgraph for each query, guided by the learned rules. Finally, a graph convolutional network (GCN) adapter uses the retrieved subgraph to enhance the structural embeddings, which are then integrated into the prompt for effective LLM fine-tuning. Experimental results on two general-domain benchmark datasets and two biomedical datasets demonstrate the superior performance of DrKGC. Furthermore, a realistic case study in the biomedical domain highlights its interpretability and practical utility.
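The abstract's final step, a GCN adapter that refines structural embeddings over the retrieved subgraph, can be pictured as one standard symmetrically normalized graph convolution layer. This is a minimal numpy sketch of a generic GCN layer (Kipf-Welling style), not DrKGC's actual adapter; `adj`, `emb`, and `w` are illustrative names:

```python
import numpy as np

def gcn_adapter(adj, emb, w, activation=np.tanh):
    """One generic GCN layer over a retrieved subgraph.

    adj: (n, n) adjacency matrix of the subgraph
    emb: (n, d) structural embeddings of its entities
    w:   (d, d_out) learned projection
    """
    # Add self-loops so each node keeps its own embedding in the aggregation.
    a_hat = adj + np.eye(adj.shape[0])
    # Symmetric degree normalization: D^{-1/2} (A + I) D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # Aggregate neighbor embeddings, project, and apply the nonlinearity.
    return activation(norm @ emb @ w)
```

In DrKGC the output of such a layer would be fed into the LLM prompt rather than used directly for scoring.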
Flock: A Knowledge Graph Foundation Model via Learning on Random Walks
Kim, Jinwoo, Huang, Xingyue, Olejniczak, Krzysztof, Min, Kyungbin, Bronstein, Michael, Hong, Seunghoon, Ceylan, İsmail İlkan
We study the problem of zero-shot link prediction on knowledge graphs (KGs), which requires models to generalize over novel entities and novel relations. Knowledge graph foundation models (KGFMs) address this task by enforcing equivariance over both nodes and relations, learning from structural properties of nodes and relations that are then transferable to novel graphs with similar structural properties. However, the conventional notion of deterministic equivariance imposes inherent limits on the expressive power of KGFMs, preventing them from distinguishing structurally similar but semantically distinct relations. To overcome this limitation, we introduce probabilistic node-relation equivariance, which preserves equivariance in distribution while incorporating a principled randomization to break symmetries during inference. Building on this principle, we present Flock, a KGFM that iteratively samples random walks, encodes them into sequences via a recording protocol, embeds them with a sequence model, and aggregates representations of nodes and relations via learned pooling. Crucially, Flock respects probabilistic node-relation equivariance and is a universal approximator for isomorphism-invariant link-level functions over KGs. Empirically, Flock perfectly solves our new diagnostic dataset Petals, where current KGFMs fail, and achieves state-of-the-art performance on entity- and relation-prediction tasks on 54 KGs from diverse domains.
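The "recording protocol" idea — encoding a sampled walk so that the result depends only on structure, not on the identities of nodes and relations — can be sketched with a first-occurrence indexing scheme, in the spirit of anonymous walks. This is an illustrative sketch, not Flock's actual protocol; the KG is assumed to be a dict from node to `(relation, neighbor)` pairs:

```python
import random

def record_walk(kg, start, length, rng):
    """Sample a random walk and record it as first-occurrence indices,
    so relabeling nodes/relations leaves the recording unchanged."""
    node_ids, rel_ids = {start: 0}, {}
    seq = [("n", 0)]
    cur = start
    for _ in range(length):
        nbrs = kg.get(cur, [])
        if not nbrs:
            break  # dead end: stop the walk early
        rel, nxt = rng.choice(nbrs)
        # Assign each relation/node an index the first time it appears.
        rel_ids.setdefault(rel, len(rel_ids))
        node_ids.setdefault(nxt, len(node_ids))
        seq.append(("r", rel_ids[rel]))
        seq.append(("n", node_ids[nxt]))
        cur = nxt
    return seq
```

Because the recording depends only on the walk's revisit pattern, two graphs that differ only by renaming nodes and relations produce identical sequences under the same sampling, which is the invariance-in-distribution property the abstract refers to.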
Evaluating Cumulative Spectral Gradient as a Complexity Measure
Gul, Haji, Naim, Abdul Ghani, Bhat, Ajaz Ahmad
Accurate estimation of dataset complexity is crucial for evaluating and comparing link-prediction models for knowledge graphs (KGs). The Cumulative Spectral Gradient (CSG) metric (Branchaud-Charron et al., 2019), derived from the probabilistic divergence between classes within a spectral clustering framework, was proposed as a dataset complexity measure that (1) naturally scales with the number of classes and (2) correlates strongly with downstream classification performance. In this work, we rigorously assess CSG's behavior on standard knowledge-graph link-prediction benchmarks, a multi-class tail-prediction task, using the two key parameters governing its computation: M, the number of Monte Carlo-sampled points per class, and K, the number of nearest neighbors in the embedding space. Contrary to the original claims, we find that (1) CSG is highly sensitive to the choice of K and thereby does not inherently scale with the number of target classes, and (2) CSG values exhibit weak or no correlation with established performance metrics such as mean reciprocal rank (MRR). Through experiments on FB15k-237, WN18RR, and other standard datasets, we demonstrate that CSG's purported stability and generalization-predictive power break down in link-prediction settings. Our results highlight the need for more robust, classifier-agnostic complexity measures in KG link-prediction evaluation.
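The two parameters the abstract studies, M and K, enter CSG through a K-NN estimate of inter-class overlap built from M Monte Carlo samples per class. A simplified numpy sketch of that estimation step (the subsequent Laplacian/eigenvalue computation of full CSG is omitted; function and argument names are illustrative):

```python
import numpy as np

def class_confusion(x, y, m, k, rng):
    """Estimate a class-confusion matrix from embeddings.

    x: (n, d) embedding vectors; y: (n,) integer class labels.
    For each class, sample m points (Monte Carlo) and estimate class
    membership probabilities from the labels of each point's k nearest
    neighbors. Rows approximate P(neighbor class | true class).
    """
    classes = np.unique(y)
    w = np.zeros((len(classes), len(classes)))
    for i, c in enumerate(classes):
        samples = rng.choice(np.where(y == c)[0], size=m, replace=True)
        for p in samples:
            d = np.linalg.norm(x - x[p], axis=1)
            d[p] = np.inf                 # exclude the point itself
            nn = np.argsort(d)[:k]        # k nearest neighbors
            for j, c2 in enumerate(classes):
                w[i, j] += np.mean(y[nn] == c2)
    return w / m
```

Even in this toy form the role of K is visible: enlarging K pulls in points from farther away, directly changing the off-diagonal mass of the matrix — the sensitivity the paper reports.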
RSCF: Relation-Semantics Consistent Filter for Entity Embedding of Knowledge Graph
Kim, Junsik, Park, Jinwook, Kim, Kangil
In knowledge graph embedding, leveraging relation-specific entity transformations has markedly enhanced performance. However, the consistency of embedding differences before and after transformation remains unaddressed, risking the loss of valuable inductive bias inherent in the embeddings. This inconsistency stems from two problems. First, transformation representations are specified for relations in a disconnected manner, allowing dissimilar transformations and corresponding entity embeddings for similar relations. Second, a generalized plug-in approach such as SFBR (Semantic Filter Based on Relations) disrupts this consistency through excessive concentration of entity embeddings under entity-based regularization, generating indistinguishable score distributions among relations. In this paper, we introduce a plug-in KGE method, the Relation-Semantics Consistent Filter (RSCF). Its entity transformation has three features for enhancing semantic consistency: 1) a shared affine transformation of relation embeddings across all relations, 2) a rooted entity transformation that adds an entity embedding to its change, represented by the transformed vector, and 3) normalization of the change to prevent scale reduction. To amplify the advantages of consistency that preserve semantics in the embeddings, RSCF adds relation transformation and prediction modules for enhancing the semantics. In knowledge graph completion tasks with distance-based and tensor-decomposition models, RSCF significantly outperforms state-of-the-art KGE methods, showing robustness across all relations and their frequencies.
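The three features listed in the abstract — a shared affine map from relation embeddings to filters, a rooted transformation, and a normalized change — can be combined in one small function. This is a hedged numpy sketch of the idea, not RSCF's actual implementation; the element-wise filter stands in for an SFBR-style semantic filter, and all names are illustrative:

```python
import numpy as np

def rscf_transform(e, w_shared, rel_emb, eps=1e-12):
    """Rooted, normalized relation-specific entity transformation.

    e:        (d,) entity embedding
    w_shared: (d, d) affine map shared by ALL relations (feature 1)
    rel_emb:  (d,) relation embedding
    """
    # Feature 1: every relation's filter comes from the same shared map,
    # so similar relations get similar filters.
    filt = w_shared @ rel_emb
    # SFBR-style element-wise semantic filter produces the entity's "change".
    change = filt * e
    # Feature 3: normalize the change so filtering cannot collapse
    # entity embeddings toward a point (scale reduction).
    change = change / (np.linalg.norm(change) + eps)
    # Feature 2: rooted — the original embedding is kept and only the
    # unit-norm change is added on top of it.
    return e + change
```

Keeping the original embedding as the root is what preserves the embedding-difference structure (the inductive bias) the abstract argues is lost under unconstrained per-relation transformations.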
MuCo-KGC: Multi-Context-Aware Knowledge Graph Completion
Gul, Haji, Bhat, Ajaz Ahmad, Naim, Abdul Ghani Haji
Knowledge graph completion (KGC) seeks to predict missing entities (e.g., heads or tails) or relationships in knowledge graphs (KGs), which often contain incomplete data. Traditional embedding-based methods, such as TransE and ComplEx, have improved tail entity prediction but struggle to generalize to unseen entities during testing. Textual-based models mitigate this issue by leveraging additional semantic context; however, their reliance on negative triplet sampling introduces high computational overhead, semantic inconsistencies, and data imbalance. Recent approaches, like KG-BERT, show promise but depend heavily on entity descriptions, which are often unavailable in KGs. Critically, existing methods overlook valuable structural information in the KG related to the entities and relationships. To address these challenges, we propose Multi-Context-Aware Knowledge Graph Completion (MuCo-KGC), a novel model that utilizes contextual information from linked entities and relations within the graph to predict tail entities. MuCo-KGC eliminates the need for entity descriptions and negative triplet sampling, significantly reducing computational complexity while enhancing performance. Our experiments on standard datasets, including FB15k-237, WN18RR, CoDEx-S, and CoDEx-M, demonstrate that MuCo-KGC outperforms state-of-the-art methods on three datasets. Notably, MuCo-KGC improves MRR on the WN18RR, CoDEx-S, and CoDEx-M datasets by $1.63\%$, $3.77\%$, and $20.15\%$ respectively, demonstrating its effectiveness for KGC tasks.
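The core input the abstract describes — context drawn from linked entities and relations rather than from entity descriptions — amounts to collecting two neighborhood sets for a (head, relation, ?) query. A minimal sketch under that reading (illustrative names only; not the paper's API or exact context definition):

```python
def muco_context(triples, head, rel):
    """Gather graph context for a (head, rel, ?) tail-prediction query.

    triples: iterable of (h, r, t) tuples.
    Returns (head context: other relations incident to the head,
             relation context: tails seen elsewhere with the query relation).
    """
    head_ctx = sorted({r for h, r, t in triples if h == head and r != rel})
    rel_ctx = sorted({t for h, r, t in triples if r == rel and h != head})
    return head_ctx, rel_ctx
```

Because both sets are computed purely from existing triples, no entity descriptions or sampled negative triples are needed, which is the efficiency argument the abstract makes.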