0c72cb7ee1512f800abe27823a792d03-AuthorFeedback.pdf

Neural Information Processing Systems

We thank all the reviewers for their comments about the novelty and significance of the work. Below we address the reviewers' two common comments.

Reviewer 1: We really appreciate your thoughtful and detailed comments. Please find our responses in the following. The results will be appended to Table 2 of the paper.


Evaluating Cumulative Spectral Gradient as a Complexity Measure

Gul, Haji, Naim, Abdul Ghani, Bhat, Ajaz Ahmad

arXiv.org Artificial Intelligence

Accurate estimation of dataset complexity is crucial for evaluating and comparing link-prediction models for knowledge graphs (KGs). The Cumulative Spectral Gradient (CSG) metric (Branchaud-Charron et al., 2019), derived from the probabilistic divergence between classes within a spectral clustering framework, was proposed as a dataset complexity measure that (1) naturally scales with the number of classes and (2) correlates strongly with downstream classification performance. In this work, we rigorously assess CSG's behavior on standard knowledge-graph link-prediction benchmarks (a multi-class tail-prediction task) using the two key parameters governing its computation: M, the number of Monte Carlo-sampled points per class, and K, the number of nearest neighbors in the embedding space. Contrary to the original claims, we find that (1) CSG is highly sensitive to the choice of K and therefore does not inherently scale with the number of target classes, and (2) CSG values exhibit weak or no correlation with established performance metrics such as mean reciprocal rank (MRR). Through experiments on FB15k-237, WN18RR, and other standard datasets, we demonstrate that CSG's purported stability and generalization-predictive power break down in link-prediction settings. Our results highlight the need for more robust, classifier-agnostic complexity measures in KG link-prediction evaluation.
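To make the role of the two parameters concrete, here is a minimal NumPy sketch of a CSG-style pipeline. This is a simplification, not the authors' implementation: the function name, the use of raw feature vectors (the original works in an embedding space), and the final aggregation of the Laplacian eigen-gaps into a single scalar are all our own assumptions.

```python
import numpy as np

def csg_sketch(X, y, M=100, K=5, seed=0):
    """Simplified CSG-style complexity estimate.
    (1) Monte Carlo-sample M points per class, (2) estimate each point's
    class posterior with a K-nearest-neighbor vote, (3) average posteriors
    into a class-similarity matrix, and (4) aggregate the eigen-gaps of its
    graph Laplacian into one scalar."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    C = len(classes)
    # (1) sample M points per class
    samples, labels = [], []
    for ci, c in enumerate(classes):
        idx = np.flatnonzero(y == c)
        pick = rng.choice(idx, size=min(M, len(idx)), replace=False)
        samples.append(X[pick])
        labels.append(np.full(len(pick), ci))
    Xs, ys = np.vstack(samples), np.concatenate(labels)
    # (2) K-NN class-posterior estimate for every sampled point
    D = np.linalg.norm(Xs[:, None, :] - Xs[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)          # exclude self-matches
    nn = np.argsort(D, axis=1)[:, :K]
    post = np.zeros((len(Xs), C))
    for i in range(len(Xs)):
        for j in nn[i]:
            post[i, ys[j]] += 1.0 / K
    # (3) average posteriors per class -> C x C class-similarity matrix
    W = np.vstack([post[ys == ci].mean(axis=0) for ci in range(C)])
    W = 0.5 * (W + W.T)                  # symmetrize
    L = np.diag(W.sum(axis=1)) - W       # unnormalized graph Laplacian
    lam = np.sort(np.linalg.eigvalsh(L))
    # (4) simplified aggregation of the eigen-gaps (assumption, see lead-in)
    return float(np.diff(lam).sum() / max(C - 1, 1))
```

Note how K enters step (2): it fixes how many neighbors vote on each posterior, which is exactly the sensitivity the abstract investigates.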


An ablation study over different model architectures (Table (a)) shows that the chosen

Neural Information Processing Systems

FB15k's lack of hierarchy offers no advantage to hyperbolic embeddings. MuRP does not set out to include MTL, but we hope to address this in future work. We will include all recommendations. We agree that it is important to compare models across a range of dimensionalities. Note that for MuRP with biases replaced by (transformed) norms, performance reduces. Multi-relational transforms and justification for architecture: see "Architecture ablation study".


A Theoretical and empirical evidence for ConE's design choice

Neural Information Processing Systems

Here we provide theoretical and empirical results to support ConE's design choices, i.e., that both the rotation transformation and the restricted transformation play a crucial role in the expressiveness of the model.

A.1 Proof for transformations

A.1.1 Proof for rotation transformation

We will show that the rotation transformation in Eq. 10 can model all relation patterns that can be modeled by its Euclidean counterpart RotatE [7]. The three most common relation patterns are discussed in [7]: the symmetry pattern, the inverse pattern, and the composition pattern. Let T denote the set of all true triples. We formally define the three relation patterns as follows.
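The definitions themselves appear to have been lost in extraction; the standard formal statements, as given for RotatE in [7] (with $T$ the set of all true triples, as above), are:

```latex
\begin{itemize}
  \item \textbf{Symmetry.} A relation $r$ is symmetric if
        $\forall h, t:\ (h, r, t) \in T \Rightarrow (t, r, h) \in T$.
  \item \textbf{Inverse.} A relation $r_1$ is the inverse of a relation $r_2$ if
        $\forall h, t:\ (h, r_1, t) \in T \Rightarrow (t, r_2, h) \in T$.
  \item \textbf{Composition.} A relation $r_3$ is the composition of $r_1$ and $r_2$ if
        $\forall x, y, z:\ (x, r_1, y) \in T \wedge (y, r_2, z) \in T \Rightarrow (x, r_3, z) \in T$.
\end{itemize}
```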


Finetuning Generative Large Language Models with Discrimination Instructions for Knowledge Graph Completion

Liu, Yang, Tian, Xiaobin, Sun, Zequn, Hu, Wei

arXiv.org Artificial Intelligence

Traditional knowledge graph (KG) completion models learn embeddings to predict missing facts. Recent works attempt to complete KGs in a text-generation manner with large language models (LLMs). However, they need to ground the output of LLMs to KG entities, which inevitably introduces errors. In this paper, we present a finetuning framework, DIFT, aiming to unleash the KG completion ability of LLMs and avoid grounding errors. Given an incomplete fact, DIFT employs a lightweight model to obtain candidate entities and finetunes an LLM with discrimination instructions to select the correct one from the given candidates. To improve performance while reducing instruction data, DIFT uses a truncated sampling method to select useful facts for finetuning and injects KG embeddings into the LLM. Extensive experiments on benchmark datasets demonstrate the effectiveness of our proposed framework.
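To illustrate the discrimination setup, the following is a hypothetical helper (not from the DIFT code; the function name and prompt wording are our own) that turns an incomplete fact and the lightweight ranker's candidate list into an instruction the LLM answers by selection rather than free generation, which is what avoids the grounding errors mentioned above.

```python
def build_discrimination_instruction(head, relation, candidates):
    """Format an incomplete fact (head, relation, ?) and a ranked candidate
    list as a selection prompt. `candidates` is assumed to be the top-k
    entities produced by a lightweight KG-embedding ranker."""
    lines = [
        f"Complete the fact: ({head}, {relation}, ?)",
        "Choose exactly one entity from the candidates below.",
    ]
    for i, entity in enumerate(candidates, start=1):
        lines.append(f"{i}. {entity}")
    lines.append("Answer with the number of the correct entity.")
    return "\n".join(lines)
```

Because the model only ever outputs an index into a closed candidate set, its answer always grounds to a real KG entity by construction.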


Subgraph-Aware Training of Text-based Methods for Knowledge Graph Completion

Ko, Youmin, Yang, Hyemin, Kim, Taeuk, Kim, Hyunjoon

arXiv.org Artificial Intelligence

Fine-tuning pre-trained language models (PLMs) has recently shown potential to improve knowledge graph completion (KGC). However, most PLM-based methods encode only textual information, neglecting the various topological structures of knowledge graphs (KGs). In this paper, we empirically validate the significant relationship between the structural properties of KGs and the performance of PLM-based methods. To leverage this structural knowledge, we propose a Subgraph-Aware Training framework for KGC (SATKGC) that combines (i) subgraph-aware mini-batching to encourage hard negative sampling, and (ii) a new contrastive learning method that focuses more on harder entities and harder negative triples in terms of their structural properties. To the best of our knowledge, this is the first study to comprehensively incorporate the structural inductive bias of subgraphs into fine-tuning PLMs. Extensive experiments on four KGC benchmarks demonstrate the superiority of SATKGC. Our code is available.
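The general idea of weighting a contrastive loss toward harder negatives can be sketched as follows. This is a generic hardness-weighted InfoNCE, not SATKGC's actual loss; the function name, the per-negative weight vector, and the temperature value are all assumptions for illustration.

```python
import math

def weighted_infonce(pos_score, neg_scores, neg_weights, tau=0.05):
    """Generic hardness-weighted contrastive (InfoNCE) loss for one triple.
    `pos_score` is the similarity of the positive pair, `neg_scores` the
    similarities of the negatives, and `neg_weights` up-weights negatives
    deemed harder (e.g., by a structural property of the subgraph)."""
    numerator = math.exp(pos_score / tau)
    denominator = numerator + sum(
        w * math.exp(s / tau) for w, s in zip(neg_weights, neg_scores)
    )
    return -math.log(numerator / denominator)
```

Raising a negative's weight (or its score) inflates the denominator, so the loss, and hence the gradient signal, concentrates on the hard negatives that subgraph-aware mini-batching puts into the batch.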


Unified Interpretation of Smoothing Methods for Negative Sampling Loss Functions in Knowledge Graph Embedding

Feng, Xincan, Kamigaito, Hidetaka, Hayashi, Katsuhiko, Watanabe, Taro

arXiv.org Artificial Intelligence

Knowledge Graphs (KGs) are fundamental resources for knowledge-intensive tasks in NLP. Because manually creating KGs is limited in scale, KG Completion (KGC) plays an important role in automatically completing KGs by scoring their links with KG Embedding (KGE). To handle the many entities encountered in training, KGE relies on a Negative Sampling (NS) loss, which reduces the computational cost by sampling. Since each link appears at most once in a KG, sparsity is an essential and inevitable problem, and the NS loss is no exception. As a solution, the NS loss in KGE relies on smoothing methods such as Self-Adversarial Negative Sampling (SANS) and subsampling. However, due to the lack of theoretical understanding, it is unclear which kind of smoothing method is suitable for this purpose. This paper provides theoretical interpretations of the smoothing methods for the NS loss in KGE and derives a new NS loss, Triplet Adaptive Negative Sampling (TANS), that covers the characteristics of the conventional smoothing methods. Experimental results for TransE, DistMult, ComplEx, RotatE, HAKE, and HousE on the FB15k-237, WN18RR, and YAGO3-10 datasets and their sparser subsets show the soundness of our interpretation and the performance improvement brought by TANS.
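For reference, the SANS smoothing mentioned above (as defined for RotatE by Sun et al., 2019) weights each negative by a softmax over the negatives' own scores, so harder negatives contribute more to the loss. A minimal NumPy sketch, assuming a distance-based score (lower distance = more plausible), a margin `gamma`, and a temperature `alpha`:

```python
import numpy as np

def sans_loss(pos_dist, neg_dists, gamma=9.0, alpha=1.0):
    """Self-adversarial NS loss for one positive triple and its negatives.
    `pos_dist` / `neg_dists` are the model's distances d(h, r, t); the
    self-adversarial weights are a softmax over the negatives' scores
    (-distance) with temperature alpha (treated as constants, no gradient)."""
    neg_dists = np.asarray(neg_dists, dtype=float)
    weights = np.exp(-alpha * neg_dists)
    weights /= weights.sum()                 # softmax over negative scores
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    pos_term = -np.log(sigmoid(gamma - pos_dist))
    neg_term = -(weights * np.log(sigmoid(neg_dists - gamma))).sum()
    return float(pos_term + neg_term)
```

A close positive and distant negatives give a near-zero loss; a distant positive or close (hard) negatives blow the loss up, and the softmax ensures the hardest negatives dominate the second term.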