Semantic Networks
NQE: N-ary Query Embedding for Complex Query Answering over Hyper-Relational Knowledge Graphs
Luo, Haoran, E, Haihong, Yang, Yuhao, Zhou, Gengxian, Guo, Yikai, Yao, Tianyu, Tang, Zichen, Lin, Xueyuan, Wan, Kaiyang
Complex query answering (CQA) is an essential task for multi-hop and logical reasoning on knowledge graphs (KGs). Currently, most approaches are limited to queries among binary relational facts and pay less attention to n-ary facts (n>=2) containing more than two entities, which are more prevalent in the real world. Moreover, previous CQA methods can only make predictions for a few given types of queries and cannot be flexibly extended to more complex logical queries, which significantly limits their applications. To overcome these challenges, in this work, we propose a novel N-ary Query Embedding (NQE) model for CQA over hyper-relational knowledge graphs (HKGs), which include massive n-ary facts. The NQE utilizes a dual-heterogeneous Transformer encoder and fuzzy logic theory to satisfy all n-ary FOL queries, including existential quantifiers, conjunction, disjunction, and negation. We also propose a parallel processing algorithm that can train or predict arbitrary n-ary FOL queries in a single batch, regardless of the kind of each query, with good flexibility and extensibility. In addition, we generate a new CQA dataset WD50K-NFOL, including diverse n-ary FOL queries over WD50K. Experimental results on WD50K-NFOL and other standard CQA datasets show that NQE is the state-of-the-art CQA method over HKGs with good generalization capability. Our code and dataset are publicly available.
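The abstract does not say which fuzzy operators NQE uses, so the following is only a minimal sketch of how fuzzy logic can realize conjunction, disjunction, and negation over query embeddings. It assumes embeddings are vectors with components in [0, 1] and uses the product t-norm; the function names are hypothetical and not taken from the NQE code.

```python
# Sketch only: element-wise product fuzzy logic on vectors in [0, 1].
# The choice of the product t-norm is an assumption, not NQE's stated method.

def conjunction(a, b):
    """Fuzzy AND (product t-norm), applied element-wise."""
    return [x * y for x, y in zip(a, b)]


def disjunction(a, b):
    """Fuzzy OR (probabilistic sum), applied element-wise."""
    return [x + y - x * y for x, y in zip(a, b)]


def negation(a):
    """Fuzzy NOT (standard complement), applied element-wise."""
    return [1.0 - x for x in a]


q1 = [0.2, 0.9]  # hypothetical embedding of sub-query 1
q2 = [0.5, 0.4]  # hypothetical embedding of sub-query 2

both = conjunction(q1, q2)
either = disjunction(q1, q2)
# De Morgan's law holds for this operator family:
# a OR b == NOT(NOT a AND NOT b)
de_morgan = negation(conjunction(negation(q1), negation(q2)))
```

Because all three operators are element-wise and parameter-free, queries of different shapes can be composed from the same building blocks.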
DHGE: Dual-View Hyper-Relational Knowledge Graph Embedding for Link Prediction and Entity Typing
Luo, Haoran, E, Haihong, Tan, Ling, Zhou, Gengxian, Yao, Tianyu, Wan, Kaiyang
In the field of representation learning on knowledge graphs (KGs), a hyper-relational fact consists of a main triple and several auxiliary attribute-value descriptions, which is considered more comprehensive and specific than a triple-based fact. However, currently available hyper-relational KG embedding methods in a single view are limited in application because they weaken the hierarchical structure that represents the affiliation between entities. To overcome this limitation, we propose a dual-view hyper-relational KG structure (DH-KG) that contains a hyper-relational instance view for entities and a hyper-relational ontology view for concepts that are abstracted hierarchically from the entities. This paper defines link prediction and entity typing tasks on DH-KG for the first time and constructs two DH-KG datasets, JW44K-6K, extracted from Wikidata, and HTDM based on medical data. Furthermore, we propose DHGE, a DH-KG embedding model based on GRAN encoders, HGNNs, and joint learning. DHGE outperforms baseline models on DH-KG, according to experimental results. Finally, we provide an example of how this technology can be used to treat hypertension. Our model and new datasets are publicly available.
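As an illustration of the data structure described above, a hyper-relational fact can be sketched as a main triple plus a set of attribute-value pairs. The class name, field names, and sample values below are hypothetical and chosen only for illustration; they do not come from the DHGE datasets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperRelationalFact:
    """A main triple (head, relation, tail) with auxiliary qualifiers."""
    head: str
    relation: str
    tail: str
    # Each qualifier is an (attribute, value) pair describing the main triple.
    qualifiers: tuple = ()

    def main_triple(self):
        """Return the triple-based view, dropping all qualifiers."""
        return (self.head, self.relation, self.tail)

    def arity(self):
        """Count entities: head and tail plus one value per qualifier."""
        return 2 + len(self.qualifiers)


# Made-up example fact with two attribute-value descriptions.
fact = HyperRelationalFact(
    head="Alice",
    relation="educated_at",
    tail="ExampleUniversity",
    qualifiers=(("degree", "PhD"), ("major", "Physics")),
)
```

Dropping the qualifiers recovers the ordinary triple, which shows why the hyper-relational form is the more specific of the two.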
Building a Knowledge Graph of Distributed Ledger Technologies
König, Lukas, Neumaier, Sebastian
Distributed ledger systems have become more prominent and successful in recent years, with a focus on blockchains and cryptocurrency. This has led to various misunderstandings about both the technology itself and its capabilities, as in many cases blockchain and cryptocurrency are used synonymously and other applications are often overlooked. Therefore, as a whole, the view of distributed ledger technology beyond blockchains and cryptocurrencies is very limited. Existing vocabularies and ontologies often focus on single aspects of the technology, or in some cases even just on one product. This potentially leads to other types of distributed ledgers and their possible use cases being neglected. In this paper, we present a knowledge graph and an ontology for distributed ledger technologies, which includes security considerations to model aspects such as threats and vulnerabilities, application domains, as well as relevant standards and regulations. Such a knowledge graph improves the overall understanding of distributed ledgers, reveals their strengths, and supports the work of security personnel, i.e., analysts and system architects. We discuss potential uses and follow semantic web best practices to evaluate and publish the ontology and knowledge graph.
GETT-QA: Graph Embedding based T2T Transformer for Knowledge Graph Question Answering
Banerjee, Debayan, Nair, Pranav Ajit, Usbeck, Ricardo, Biemann, Chris
In this work, we present an end-to-end Knowledge Graph Question Answering (KGQA) system named GETT-QA. GETT-QA uses T5, a popular text-to-text pre-trained language model. The model takes a question in natural language as input and produces a simpler form of the intended SPARQL query. In the simpler form, the model does not directly produce entity and relation IDs. Instead, it produces corresponding entity and relation labels. The labels are grounded to KG entity and relation IDs in a subsequent step. To further improve the results, we instruct the model to produce a truncated version of the KG embedding for each entity. The truncated KG embedding enables a finer search for disambiguation purposes. We find that T5 is able to learn the truncated KG embeddings without any change of loss function, improving KGQA performance. As a result, we report strong results for LC-QuAD 2.0 and SimpleQuestions-Wikidata datasets on end-to-end KGQA over Wikidata.
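The disambiguation step can be pictured roughly as follows. This is a sketch under stated assumptions, not the GETT-QA implementation: it assumes that "truncated" means the first k dimensions of an entity embedding, that candidates come from a label lookup, and that Euclidean distance is used for ranking. All names and IDs are hypothetical.

```python
import math


def disambiguate(predicted_prefix, candidates):
    """Pick the candidate whose truncated embedding is closest to the prediction.

    predicted_prefix: truncated embedding produced by the model (assumed to be
                      the first k dimensions).
    candidates: list of (entity_id, full_embedding) pairs sharing the same label.
    """
    k = len(predicted_prefix)

    def distance(candidate):
        _, embedding = candidate
        # Compare only the first k dimensions of the stored embedding.
        return math.dist(predicted_prefix, embedding[:k])

    return min(candidates, key=distance)[0]


# Two made-up entities that share a label but have different embeddings.
candidates = [
    ("ID_A", [0.9, 0.1, 0.3]),
    ("ID_B", [0.1, 0.8, 0.3]),
]
best = disambiguate([0.2, 0.7], candidates)
```

The label narrows the search to a few candidates, and the short embedding prefix breaks the tie among them.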
Pre-training Transformers for Knowledge Graph Completion
Chen, Sanxing, Cheng, Hao, Liu, Xiaodong, Jiao, Jian, Ji, Yangfeng, Gao, Jianfeng
As a fundamental component of human intelligence, relational knowledge plays a crucial role in imitating human cognitive abilities with machine learning (Halford et al., 2010). Knowledge graphs (KGs) are the most widely used representation of relational knowledge, with well-known examples such as Freebase (Bollacker et al., 2008), YAGO (Suchanek et al., 2007), and Wikidata (Vrandečić and Krötzsch, 2014). KG is also a key ingredient [...] Co-training LMs and KG completion models has been shown to be effective in improving the performance of downstream knowledge-intensive NLP tasks, but not so much for the KG completion task itself (Wang et al., 2021; Yasunaga et al., 2022). Despite the progress on transferring knowledge between structured KGs and unstructured texts, the generalization from one KG to another is still an open problem that is rarely studied (Kocijan and Lukasiewicz, 2021).
Joint embedding in Hierarchical distance and semantic representation learning for link prediction
Liu, Jin, Chen, Jianye, Fan, Chongfeng, Zhou, Fengyu
The link prediction task aims to predict missing entities or relations in a knowledge graph and is essential for downstream applications. Existing well-known models deal with this task mainly by representing knowledge graph triplets in a distance space or a semantic space. However, they cannot fully capture the information of head and tail entities, nor do they make good use of hierarchical level information. Thus, in this paper, we propose a novel knowledge graph embedding model for the link prediction task, namely HIE, which models each triplet (h, r, t) in a distance measurement space and a semantic measurement space simultaneously. Moreover, HIE is introduced into a hierarchical-aware space to leverage rich hierarchical information of entities and relations for better representation learning. Specifically, we apply a distance transformation operation on the head entity in the distance space to obtain the tail entity, instead of translation-based or rotation-based approaches. Experimental results on four real-world datasets show that HIE outperforms several existing state-of-the-art knowledge graph embedding methods on the link prediction task and deals with complex relations accurately.
Using Graph Algorithms to Pretrain Graph Completion Transformers
Pilault, Jonathan, Galkin, Michael, Fatemi, Bahare, Taslakian, Perouz, Vasquez, David, Pal, Christopher
Recent work on Graph Neural Networks has demonstrated that self-supervised pretraining can further enhance performance on downstream graph, link, and node classification tasks. However, the efficacy of pretraining tasks has not been fully investigated for downstream large knowledge graph completion tasks. Using a contextualized knowledge graph embedding approach, we investigate five different pretraining signals, constructed using several graph algorithms and no external data, as well as their combination. We leverage the versatility of our Transformer-based model to explore graph structure generation pretraining tasks (i.e., path and k-hop neighborhood generation), typically inapplicable to most graph embedding methods. We further propose a new path-finding algorithm guided by information gain and find that it is the best-performing pretraining task across three downstream knowledge graph completion datasets. While using our new path-finding algorithm as a pretraining signal provides 2-3% MRR improvements, we show that pretraining on all signals together gives the best knowledge graph completion results. In a multitask setting that combines all pretraining tasks, our method surpasses the latest strong-performing knowledge graph embedding methods on all metrics for FB15K-237, on MRR and Hit@1 for WN18RR, and on MRR and Hit@10 for JF17K (a knowledge hypergraph dataset).
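For readers unfamiliar with the metrics named above, the following sketch shows how MRR and Hit@k are conventionally computed from the rank of the correct entity in each test query. The rank values are made up for illustration and are not results from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test queries whose correct answer is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct entity for four test queries.
ranks = [1, 2, 4, 10]

mrr = mean_reciprocal_rank(ranks)  # (1 + 0.5 + 0.25 + 0.1) / 4
hit1 = hits_at_k(ranks, 1)
hit10 = hits_at_k(ranks, 10)
```

MRR rewards placing the correct answer near the top, while Hit@k only checks whether it falls within a cutoff, which is why a method can lead on one metric and not the other.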
Expanding Knowledge Graphs with Humans in the Loop
Manzoor, Emaad, Tong, Jordan, Vijayaraghavan, Sriniketh, Li, Rui
Curated knowledge graphs encode domain expertise and improve the performance of recommendation, segmentation, ad targeting, and other machine learning systems in several domains. As new concepts emerge in a domain, knowledge graphs must be expanded to preserve machine learning performance. Manually expanding knowledge graphs, however, is infeasible at scale. In this work, we propose a method for knowledge graph expansion with humans-in-the-loop. Concretely, given a knowledge graph, our method predicts the "parents" of new concepts to be added to this graph for further verification by human experts. We show that our method is both accurate and provably "human-friendly". Specifically, we prove that our method predicts parents that are "near" concepts' true parents in the knowledge graph, even when the predictions are incorrect. We then show, with a controlled experiment, that satisfying this property increases both the speed and the accuracy of the human-algorithm collaboration. We further evaluate our method on a knowledge graph from Pinterest and show that it outperforms competing methods on both accuracy and human-friendliness. Upon deployment in production at Pinterest, our method reduced the time needed for knowledge graph expansion by ~400% (compared to manual expansion), and contributed to a subsequent increase in ad revenue of 20%.
Farspredict: A benchmark dataset for link prediction
Torabian, Najmeh, Minaei-Bidgoli, Behrouz, Jahanshahi, Mohsen
Knowledge graphs have received much attention in recent years due to their applications that offer significant economic benefits. A knowledge graph contains knowledge obtained from sources such as texts and tables. It has many applications in natural language processing and has been investigated as a potential reasoning source for explainable artificial intelligence. Although the impact of creating knowledge graphs in non-English languages has been explored recently, little attention has been paid to preparing a suitable knowledge graph for use in the link prediction field. At the same time, one of the main reasons that significant progress has yet to be made in Persian reasoning, recommendation systems, and other similar fields is the lack of a proper knowledge graph in this language. Of the attempts made to construct a Persian knowledge graph, the most successful is the Farsbase project. However, when we applied Farsbase to link prediction with knowledge graph embedding (KGE) models, we found it too weak for this purpose. Among state-of-the-art link prediction methods, we turn to KGE methods. These methods were introduced with TransE, which belongs to the family of translational distance models.
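To make the idea of a translational distance model concrete, here is a minimal sketch of the TransE scoring function, which treats a relation as a translation from head to tail and scores a triple by the negative distance between h + r and t. The vectors are made-up toy values, and the function name is hypothetical.

```python
import math


def transe_score(h, r, t, norm=2):
    """Return -||h + r - t|| under the L1 or L2 norm; higher is more plausible."""
    diffs = [hi + ri - ti for hi, ri, ti in zip(h, r, t)]
    if norm == 1:
        distance = sum(abs(d) for d in diffs)
    else:
        distance = math.sqrt(sum(d * d for d in diffs))
    return -distance


# Toy 2-dimensional embeddings (illustrative values only).
head = [1.0, 0.0]
relation = [0.0, 1.0]
good_tail = [1.0, 1.0]  # exactly head + relation
bad_tail = [4.0, 5.0]   # far from head + relation

good = transe_score(head, relation, good_tail)
bad = transe_score(head, relation, bad_tail)
```

In link prediction, every candidate tail is scored this way and the candidates are ranked by score, so a sparse or noisy graph directly degrades the learned translations.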
Mutually-paced Knowledge Distillation for Cross-lingual Temporal Knowledge Graph Reasoning
Wang, Ruijie, Li, Zheng, Yang, Jingfeng, Cao, Tianyu, Zhang, Chao, Yin, Bing, Abdelzaher, Tarek
This paper investigates the cross-lingual temporal knowledge graph reasoning problem, which aims to facilitate reasoning on Temporal Knowledge Graphs (TKGs) in low-resource languages by transferring knowledge from TKGs in high-resource ones. The cross-lingual distillation ability across TKGs becomes increasingly crucial in light of the unsatisfactory performance of existing reasoning methods on severely incomplete TKGs, especially in low-resource languages. However, it poses tremendous challenges in two aspects. First, the cross-lingual alignments, which serve as bridges for knowledge transfer, are usually too scarce to transfer sufficient knowledge between two TKGs. Second, the temporal knowledge discrepancy of the aligned entities, especially when alignments are unreliable, can mislead the knowledge distillation process. We correspondingly propose a mutually-paced knowledge distillation model, MP-KD, in which a teacher network trained on a source TKG guides the training of a student network on target TKGs with an alignment module. Concretely, to deal with the scarcity issue, MP-KD generates pseudo alignments between TKGs based on the temporal information extracted by our representation module. To maximize the efficacy of knowledge transfer and control the noise caused by the temporal knowledge discrepancy, we enhance MP-KD with a temporal cross-lingual attention mechanism that dynamically estimates the alignment strength. The two procedures are mutually paced along with model training. Extensive experiments on twelve cross-lingual TKG transfer tasks in the EventKG benchmark demonstrate the effectiveness of the proposed MP-KD method.