Semantic Networks
Knowledge Graph Reasoning over Entities and Numerical Values
Bai, Jiaxin, Luo, Chen, Li, Zheng, Yin, Qingyu, Yin, Bing, Song, Yangqiu
A complex logic query in a knowledge graph refers to a query expressed in logic form that conveys a complex meaning, such as where did the Canadian Turing award winner graduate from? Knowledge graph reasoning-based applications, such as dialogue systems and interactive search engines, rely on the ability to answer complex logic queries as a fundamental task. In most knowledge graphs, edges are typically used to either describe the relationships between entities or their associated attribute values. An attribute value can be in categorical or numerical format, such as dates, years, sizes, etc. However, existing complex query answering (CQA) methods simply treat numerical values in the same way as they treat entities. This can lead to difficulties in answering certain queries, such as which Australian Pulitzer award winner is born before 1927, and which drug is a pain reliever and has fewer side effects than Paracetamol. In this work, inspired by the recent advances in numerical encoding and knowledge graph reasoning, we propose numerical complex query answering. In this task, we introduce new numerical variables and operations to describe queries involving numerical attribute values. To address the difference between entities and numerical values, we also propose the framework of Number Reasoning Network (NRN) for alternatively encoding entities and numerical values into separate encoding structures. During the numerical encoding process, NRN employs a parameterized density function to encode the distribution of numerical values. During the entity encoding process, NRN uses established query encoding methods for the original CQA problem. Experimental results show that NRN consistently improves various query encoding methods on three different knowledge graphs and achieves state-of-the-art results.
BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models
Hao, Shibo, Tan, Bowen, Tang, Kaiwen, Ni, Bin, Shao, Xiyan, Zhang, Hengzhe, Xing, Eric P., Hu, Zhiting
It is crucial to automatically construct knowledge graphs (KGs) of diverse new relations to support knowledge discovery and broad applications. Previous KG construction methods, based on either crowdsourcing or text mining, are often limited to a small predefined set of relations due to manual cost or restrictions in text corpus. Recent research proposed to use pretrained language models (LMs) as implicit knowledge bases that accept knowledge queries with prompts. Yet, the implicit knowledge lacks many desirable properties of a full-scale symbolic KG, such as easy access, navigation, editing, and quality assurance. In this paper, we propose a new approach of harvesting massive KGs of arbitrary relations from pretrained LMs. With minimal input of a relation definition (a prompt and a few shot of example entity pairs), the approach efficiently searches in the vast entity pair space to extract diverse accurate knowledge of the desired relation. We develop an effective search-and-rescore mechanism for improved efficiency and accuracy. We deploy the approach to harvest KGs of over 400 new relations from different LMs. Extensive human and automatic evaluations show our approach manages to extract diverse accurate knowledge, including tuples of complex relations (e.g., "A is capable of but not good at B"). The resulting KGs as a symbolic interpretation of the source LMs also reveal new insights into the LMs' knowledge capacities.
Knowledge Graph Embedding with Electronic Health Records Data via Latent Graphical Block Model
Lu, Junwei, Yin, Jin, Cai, Tianxi
Due to the increasing adoption of electronic health records (EHR), large scale EHRs have become another rich data source for translational clinical research. Despite its potential, deriving generalizable knowledge from EHR data remains challenging. First, EHR data are generated as part of clinical care with data elements too detailed and fragmented for research. Despite recent progress in mapping EHR data to common ontology with hierarchical structures, much development is still needed to enable automatic grouping of local EHR codes to meaningful clinical concepts at a large scale. Second, the total number of unique EHR features is large, imposing methodological challenges to derive reproducible knowledge graph, especially when interest lies in conditional dependency structure. Third, the detailed EHR data on a very large patient cohort imposes additional computational challenge to deriving a knowledge network. To overcome these challenges, we propose to infer the conditional dependency structure among EHR features via a latent graphical block model (LGBM). The LGBM has a two layer structure with the first providing semantic embedding vector (SEV) representation for the EHR features and the second overlaying a graphical block model on the latent SEVs. The block structures on the graphical model also allows us to cluster synonymous features in EHR. We propose to learn the LGBM efficiently, in both statistical and computational sense, based on the empirical point mutual information matrix. We establish the statistical rates of the proposed estimators and show the perfect recovery of the block structure. Numerical results from simulation studies and real EHR data analyses suggest that the proposed LGBM estimator performs well in finite sample.
Together We Make Sense -- Learning Meta-Sense Embeddings from Pretrained Static Sense Embeddings
Luo, Haochen, Zhou, Yi, Bollegala, Danushka
Sense embedding learning methods learn multiple vectors for a given ambiguous word, corresponding to its different word senses. For this purpose, different methods have been proposed in prior work on sense embedding learning that use different sense inventories, sense-tagged corpora and learning methods. However, not all existing sense embeddings cover all senses of ambiguous words equally well due to the discrepancies in their training resources. To address this problem, we propose the first-ever meta-sense embedding method -- Neighbour Preserving Meta-Sense Embeddings, which learns meta-sense embeddings by combining multiple independently trained source sense embeddings such that the sense neighbourhoods computed from the source embeddings are preserved in the meta-embedding space. Our proposed method can combine source sense embeddings that cover different sets of word senses. Experimental results on Word Sense Disambiguation (WSD) and Word-in-Context (WiC) tasks show that the proposed meta-sense embedding method consistently outperforms several competitive baselines.
History Repeats: Overcoming Catastrophic Forgetting For Event-Centric Temporal Knowledge Graph Completion
Mirtaheri, Mehrnoosh, Rostami, Mohammad, Galstyan, Aram
Temporal knowledge graph (TKG) completion models typically rely on having access to the entire graph during training. However, in real-world scenarios, TKG data is often received incrementally as events unfold, leading to a dynamic non-stationary data distribution over time. While one could incorporate fine-tuning to existing methods to allow them to adapt to evolving TKG data, this can lead to forgetting previously learned patterns. Alternatively, retraining the model with the entire updated TKG can mitigate forgetting but is computationally burdensome. To address these challenges, we propose a general continual training framework that is applicable to any TKG completion method, and leverages two key ideas: (i) a temporal regularization that encourages repurposing of less important model parameters for learning new knowledge, and (ii) a clustering-based experience replay that reinforces the past knowledge by selectively preserving only a small portion of the past data. Our experimental results on widely used event-centric TKG datasets demonstrate the effectiveness of our proposed continual training framework in adapting to new events while reducing catastrophic forgetting. Further, we perform ablation studies to show the effectiveness of each component of our proposed framework. Finally, we investigate the relation between the memory dedicated to experience replay and the benefit gained from our clustering-based sampling strategy.
Dense-ATOMIC: Towards Densely-connected ATOMIC with High Knowledge Coverage and Massive Multi-hop Paths
Shen, Xiangqing, Wu, Siwei, Xia, Rui
ATOMIC is a large-scale commonsense knowledge graph (CSKG) containing everyday if-then knowledge triplets, i.e., {head event, relation, tail event}. The one-hop annotation manner made ATOMIC a set of independent bipartite graphs, which ignored the numerous links between events in different bipartite graphs and consequently caused shortages in knowledge coverage and multi-hop paths. In this work, we aim to construct Dense-ATOMIC with high knowledge coverage and massive multi-hop paths. The events in ATOMIC are normalized to a consistent pattern at first. We then propose a CSKG completion method called Rel-CSKGC to predict the relation given the head event and the tail event of a triplet, and train a CSKG completion model based on existing triplets in ATOMIC. We finally utilize the model to complete the missing links in ATOMIC and accordingly construct Dense-ATOMIC. Both automatic and human evaluation on an annotated subgraph of ATOMIC demonstrate the advantage of Rel-CSKGC over strong baselines. We further conduct extensive evaluations on Dense-ATOMIC in terms of statistics, human evaluation, and simple downstream tasks, all proving Dense-ATOMIC's advantages in Knowledge Coverage and Multi-hop Paths. Both the source code of Rel-CSKGC and Dense-ATOMIC are publicly available on https://github.com/NUSTM/Dense-ATOMIC.
Commonsense Knowledge Graph Completion Via Contrastive Pretraining and Node Clustering
Wu, Siwei, Shen, Xiangqing, Xia, Rui
The nodes in the commonsense knowledge graph (CSKG) are normally represented by free-form short text (e.g., word or phrase). Different nodes may represent the same concept. This leads to the problems of edge sparsity and node redundancy, which challenges CSKG representation and completion. On the one hand, edge sparsity limits the performance of graph representation learning; On the other hand, node redundancy makes different nodes corresponding to the same concept have inconsistent relations with other nodes. To address the two problems, we propose a new CSKG completion framework based on Contrastive Pretraining and Node Clustering (CPNC). Contrastive Pretraining constructs positive and negative head-tail node pairs on CSKG and utilizes contrastive learning to obtain better semantic node representation. Node Clustering aggregates nodes with the same concept into a latent concept, assisting the task of CSKG completion. We evaluate our CPNC approach on two CSKG completion benchmarks (CN-100K and ATOMIC), where CPNC outperforms the state-of-the-art methods. Extensive experiments demonstrate that both Contrastive Pretraining and Node Clustering can significantly improve the performance of CSKG completion. The source code of CPNC is publicly available on \url{https://github.com/NUSTM/CPNC}.
Collective Knowledge Graph Completion with Mutual Knowledge Distillation
Zhang, Weihang, Serban, Ovidiu, Sun, Jiahao, Guo, Yi-ke
Knowledge graph completion (KGC), the task of predicting missing information based on the existing relational data inside a knowledge graph (KG), has drawn significant attention in recent years. However, the predictive power of KGC methods is often limited by the completeness of the existing knowledge graphs from different sources and languages. In monolingual and multilingual settings, KGs are potentially complementary to each other. In this paper, we study the problem of multi-KG completion, where we focus on maximizing the collective knowledge from different KGs to alleviate the incompleteness of individual KGs. Specifically, we propose a novel method called CKGC-CKD that uses relation-aware graph convolutional network encoder models on both individual KGs and a large fused KG in which seed alignments between KGs are regarded as edges for message propagation. An additional mutual knowledge distillation mechanism is also employed to maximize the knowledge transfer between the models of "global" fused KG and the "local" individual KGs. Experimental results on multilingual datasets have shown that our method outperforms all state-of-the-art models in the KGC task.
Text-Augmented Open Knowledge Graph Completion via Pre-Trained Language Models
Jiang, Pengcheng, Agarwal, Shivam, Jin, Bowen, Wang, Xuan, Sun, Jimeng, Han, Jiawei
The mission of open knowledge graph (KG) completion is to draw new findings from known facts. Existing works that augment KG completion require either (1) factual triples to enlarge the graph reasoning space or (2) manually designed prompts to extract knowledge from a pre-trained language model (PLM), exhibiting limited performance and requiring expensive efforts from experts. To this end, we propose TAGREAL that automatically generates quality query prompts and retrieves support information from large text corpora to probe knowledge from PLM for KG completion. The results show that TAGREAL achieves state-of-the-art performance on two benchmark datasets. We find that TAGREAL has superb performance even with limited training data, outperforming existing embedding-based, graph-based, and PLM-based methods.
Learning Meta Representations of One-shot Relations for Temporal Knowledge Graph Link Prediction
Ding, Zifeng, He, Bailan, Ma, Yunpu, Han, Zhen, Tresp, Volker
Few-shot relational learning for static knowledge graphs (KGs) has drawn greater interest in recent years, while few-shot learning for temporal knowledge graphs (TKGs) has hardly been studied. Compared to KGs, TKGs contain rich temporal information, thus requiring temporal reasoning techniques for modeling. This poses a greater challenge in learning few-shot relations in the temporal context. In this paper, we follow the previous work that focuses on few-shot relational learning on static KGs and extend two fundamental TKG reasoning tasks, i.e., interpolated and extrapolated link prediction, to the one-shot setting. We propose four new large-scale benchmark datasets and develop a TKG reasoning model for learning one-shot relations in TKGs. Experimental results show that our model can achieve superior performance on all datasets in both TKG link prediction tasks.