Goto

Collaborating Authors

 Xu, Zezhong


Start from Zero: Triple Set Prediction for Automatic Knowledge Graph Completion

arXiv.org Artificial Intelligence

Knowledge graph (KG) completion aims to find out missing triples in a KG. Some tasks, such as link prediction and instance completion, have been proposed for KG completion. They are triple-level tasks with some elements in a missing triple given to predict the missing element of the triple. However, knowing some elements of the missing triple in advance is not always a realistic setting. In this paper, we propose a novel graph-level automatic KG completion task called Triple Set Prediction (TSP) which assumes none of the elements in the missing triples is given. TSP is to predict a set of missing triples given a set of known triples. To properly and accurately evaluate this new task, we propose 4 evaluation metrics including 3 classification metrics and 1 ranking metric, considering both the partial-open-world and the closed-world assumptions. Furthermore, to tackle the huge candidate triples for prediction, we propose a novel and efficient subgraph-based method GPHT that can predict the triple set fast. To fairly compare the TSP results, we also propose two types of methods RuleTensor-TSP and KGE-TSP applying the existing rule- and embedding-based methods for TSP as baselines. During experiments, we evaluate the proposed methods on two datasets extracted from Wikidata following the relation-similarity partial-open-world assumption proposed by us, and also create a complete family data set to evaluate TSP results following the closed-world assumption. Results prove that the methods can successfully generate a set of missing triples and achieve reasonable scores on the new task, and GPHT performs better than the baselines with significantly shorter prediction time. The datasets and code for experiments are available at https://github.com/zjukg/GPHT-for-TSP.


Prompt-fused framework for Inductive Logical Query Answering

arXiv.org Artificial Intelligence

Answering logical queries on knowledge graphs (KG) poses a significant challenge for machine reasoning. The primary obstacle in this task stems from the inherent incompleteness of KGs. Existing research has predominantly focused on addressing the issue of missing edges in KGs, thereby neglecting another aspect of incompleteness: the emergence of new entities. Furthermore, most of the existing methods tend to reason over each logical operator separately, rather than comprehensively analyzing the query as a whole during the reasoning process. In this paper, we propose a query-aware prompt-fused framework named Pro-QE, which could incorporate existing query embedding methods and address the embedding of emerging entities through contextual information aggregation. Additionally, a query prompt, which is generated by encoding the symbolic query, is introduced to gather information relevant to the query from a holistic perspective. To evaluate the efficacy of our model in the inductive setting, we introduce two new challenging benchmarks. Experimental results demonstrate that our model successfully handles the issue of unseen entities in logical queries. Furthermore, the ablation study confirms the efficacy of the aggregator and prompt components.


InBox: Recommendation with Knowledge Graph using Interest Box Embedding

arXiv.org Artificial Intelligence

Knowledge graphs (KGs) have become vitally important in modern recommender systems, effectively improving performance and interpretability. Fundamentally, recommender systems aim to identify user interests based on historical interactions and recommend suitable items. However, existing works overlook two key challenges: (1) an interest corresponds to a potentially large set of related items, and (2) the lack of explicit, fine-grained exploitation of KG information and interest connectivity. This leads to an inability to reflect distinctions between entities and interests when modeling them in a single way. Additionally, the granularity of concepts in the knowledge graphs used for recommendations tends to be coarse, failing to match the fine-grained nature of user interests. This homogenization limits the precise exploitation of knowledge graph data and interest connectivity. To address these limitations, we introduce a novel embedding-based model called InBox. Specifically, various knowledge graph entities and relations are embedded as points or boxes, while user interests are modeled as boxes encompassing interaction history. Representing interests as boxes enables containing collections of item points related to that interest. We further propose that an interest comprises diverse basic concepts, and box intersection naturally supports concept combination. Across three training steps, InBox significantly outperforms state-of-the-art methods like HAKG and KGIN on recommendation tasks. Further analysis provides meaningful insights into the variable value of different KG data for recommendations. In summary, InBox advances recommender systems through box-based interest and concept modeling for sophisticated knowledge graph exploitation.


Generalizing to Unseen Elements: A Survey on Knowledge Extrapolation for Knowledge Graphs

arXiv.org Artificial Intelligence

Knowledge graphs (KGs) have become valuable knowledge resources in various applications, and knowledge graph embedding (KGE) methods have garnered increasing attention in recent years. However, conventional KGE methods still face challenges when it comes to handling unseen entities or relations during model testing. To address this issue, much effort has been devoted to various fields of KGs. In this paper, we use a set of general terminologies to unify these methods and refer to them collectively as Knowledge Extrapolation. We comprehensively summarize these methods, classified by our proposed taxonomy, and describe their interrelationships. Additionally, we introduce benchmarks and provide comparisons of these methods based on aspects that are not captured by the taxonomy. Finally, we suggest potential directions for future research.


Commonsense Knowledge Salience Evaluation with a Benchmark Dataset in E-commerce

arXiv.org Artificial Intelligence

In e-commerce, the salience of commonsense knowledge (CSK) is beneficial for widespread applications such as product search and recommendation. For example, when users search for ``running'' in e-commerce, they would like to find products highly related to running, such as ``running shoes'' rather than ``shoes''. Nevertheless, many existing CSK collections rank statements solely by confidence scores, and there is no information about which ones are salient from a human perspective. In this work, we define the task of supervised salience evaluation, where given a CSK triple, the model is required to learn whether the triple is salient or not. In addition to formulating the new task, we also release a new Benchmark dataset of Salience Evaluation in E-commerce (BSEE) and hope to promote related research on commonsense knowledge salience evaluation. We conduct experiments in the dataset with several representative baseline models. The experimental results show that salience evaluation is a challenging task where models perform poorly on our evaluation set. We further propose a simple but effective approach, PMI-tuning, which shows promise for solving this novel problem. Code is available in \url{https://github.com/OpenBGBenchmark/OpenBG-CSK.