 Semantic Networks


Explainable Sparse Knowledge Graph Completion via High-order Graph Reasoning Network

arXiv.org Artificial Intelligence

Knowledge Graphs (KGs) are becoming increasingly essential infrastructure in many applications, yet they suffer from incompleteness. The KG completion (KGC) task automatically predicts missing facts based on an incomplete KG. However, existing methods perform unsatisfactorily in real-world scenarios. On the one hand, their performance degrades dramatically as KGs become sparser. On the other hand, the inference procedure behind a prediction is an untrustworthy black box. This paper proposes HoGRN, a novel explainable model for sparse KGC that combines high-order reasoning with a graph convolutional network. It not only improves generalization to mitigate the information insufficiency issue but also provides interpretability while maintaining the model's effectiveness and efficiency. Two main components are seamlessly integrated for joint optimization. First, the high-order reasoning component learns high-quality relation representations by capturing endogenous correlation among relations. This can reflect logical rules to justify a broader range of missing facts. Second, the entity updating component leverages a weight-free Graph Convolutional Network (GCN) to efficiently model KG structures with interpretability. Unlike conventional methods, we conduct entity aggregation and design composition-based attention in the relational space without additional parameters. The lightweight design makes HoGRN better suited to sparse settings. For evaluation, we have conducted extensive experiments; the results of HoGRN on several sparse KGs show impressive improvements (9% MRR gain on average). Further ablation and case studies demonstrate the effectiveness of the main components. Our code will be released upon acceptance.


The DLCC Node Classification Benchmark for Analyzing Knowledge Graph Embeddings

arXiv.org Artificial Intelligence

Knowledge graph embedding is a representation learning technique that projects entities and relations in a knowledge graph to continuous vector spaces. Embeddings have gained a lot of uptake and have been heavily used in link prediction and other downstream prediction tasks. Most approaches are evaluated on a single task or a single group of tasks to determine their overall performance. The evaluation is then assessed in terms of how well the embedding approach performs on the task at hand. Still, it is hardly evaluated (and often not even deeply understood) what information the embedding approaches are actually learning to represent. To fill this gap, we present the DLCC (Description Logic Class Constructors) benchmark, a resource to analyze embedding approaches in terms of which kinds of classes they can represent. Two gold standards are presented, one based on the real-world knowledge graph DBpedia and one synthetic gold standard. In addition, an evaluation framework is provided that implements an experiment protocol so that researchers can directly use the gold standard. To demonstrate the use of DLCC, we compare multiple embedding approaches using the gold standards. We find that many DL constructors on DBpedia are actually learned by recognizing correlated patterns different from those defined in the gold standard, and that specific DL constructors, such as cardinality constraints, are particularly hard for most embedding approaches to learn.


LiveSchema: A Gateway Towards Learning on Knowledge Graph Schemas

arXiv.org Artificial Intelligence

One of the major barriers to the training of algorithms on knowledge graph schemas, such as vocabularies or ontologies, is the difficulty that scientists have in finding the best input resource to address the target prediction tasks. In addition to this, a key challenge is to determine how to manipulate (and embed) these data, which are often in the form of particular triples (i.e., subject, predicate, object), to enable the learning process. In this paper, we describe the LiveSchema initiative, namely a gateway that offers a family of services to easily access, analyze, transform and exploit knowledge graph schemas, with the main goal of facilitating the reuse of these resources in machine learning use cases. As an early implementation of the initiative, we also advance an online catalog, which relies on more than 800 resources, with the first set of example services.
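
As a minimal illustration of the "manipulate (and embed)" step mentioned above, the sketch below maps (subject, predicate, object) triples to integer IDs, which is a common first step before training an embedding model. The function name `index_triples` and the sample triples are hypothetical and are not part of LiveSchema.

```python
def index_triples(triples):
    """Map string triples to integer IDs (hypothetical helper, not a LiveSchema API)."""
    entity_ids, predicate_ids = {}, {}
    encoded = []
    for subj, pred, obj in triples:
        # setdefault assigns the next free ID the first time a name is seen
        s_id = entity_ids.setdefault(subj, len(entity_ids))
        p_id = predicate_ids.setdefault(pred, len(predicate_ids))
        o_id = entity_ids.setdefault(obj, len(entity_ids))
        encoded.append((s_id, p_id, o_id))
    return encoded, entity_ids, predicate_ids


# Illustrative schema-level triples (made up for this example)
sample = [
    ("Person", "subClassOf", "Agent"),
    ("Student", "subClassOf", "Person"),
]
encoded, entity_ids, predicate_ids = index_triples(sample)
```

Subjects and objects share one ID space here because the same class can appear in either position; predicates get their own.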


CompoundE: Knowledge Graph Embedding with Translation, Rotation and Scaling Compound Operations

arXiv.org Artificial Intelligence

Translation, rotation, and scaling are three commonly used geometric manipulation operations in image processing. Some of them have also been used successfully in developing effective knowledge graph embedding (KGE) models such as TransE and RotatE. Inspired by this synergy, we propose a new KGE model that leverages all three operations. Since translation, rotation, and scaling operations are cascaded to form a compound one, the new model is named CompoundE. By casting CompoundE in the framework of group theory, we show that quite a few scoring-function-based KGE models are special cases of CompoundE. CompoundE extends the simple distance-based relation to relation-dependent compound operations on head and/or tail entities. To demonstrate the effectiveness of CompoundE, we conduct experiments on three popular KG completion datasets. Experimental results show that CompoundE consistently achieves state-of-the-art performance.
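
To make the idea of a cascaded operation concrete, here is a toy sketch in two dimensions. It is not the authors' implementation: the function names, the order of operations (scale, then rotate, then translate), and the use of a single 2D point per entity are all assumptions made for illustration.

```python
import math


def compound_transform(point, scale, angle, shift):
    """Scale, then rotate, then translate a 2D point (order is an assumption)."""
    x, y = point
    x, y = scale[0] * x, scale[1] * y                # scaling
    c, s = math.cos(angle), math.sin(angle)
    x, y = c * x - s * y, s * x + c * y              # rotation by `angle` radians
    return (x + shift[0], y + shift[1])              # translation


def score(head, tail, scale, angle, shift):
    """Negative distance between the transformed head and the tail.

    A higher (closer to zero) score means the triple is more plausible.
    """
    hx, hy = compound_transform(head, scale, angle, shift)
    return -math.hypot(hx - tail[0], hy - tail[1])


# With unit scale and zero rotation only the translation remains,
# which is the translation-only special case in this toy setting.
translated = compound_transform((1.0, 0.0), (1.0, 1.0), 0.0, (1.0, 1.0))
full = score((1.0, 0.0), (1.0, 3.0), (2.0, 2.0), math.pi / 2, (1.0, 1.0))
```

Setting the scale to one and the translation to zero leaves a pure rotation, the other special case the toy example can express.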


Start Small, Think Big: On Hyperparameter Optimization for Large-Scale Knowledge Graph Embeddings

arXiv.org Artificial Intelligence

Knowledge graph embedding (KGE) models are an effective and popular approach to represent and reason with multi-relational data. Prior studies have shown that KGE models are sensitive to hyperparameter settings, however, and that suitable choices are dataset-dependent. In this paper, we explore hyperparameter optimization (HPO) for very large knowledge graphs, where the cost of evaluating individual hyperparameter configurations is excessive. Prior studies often avoided this cost by using various heuristics; e.g., by training on a subgraph or by using fewer epochs. We systematically discuss and evaluate the quality and cost savings of such heuristics and other low-cost approximation techniques. Based on our findings, we introduce GraSH, an efficient multi-fidelity HPO algorithm for large-scale KGEs that combines both graph and epoch reduction techniques and runs in multiple rounds of increasing fidelities. We conducted an experimental study and found that GraSH obtains state-of-the-art results on large graphs at a low cost (three complete training runs in total).
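
The multi-round, increasing-fidelity idea can be sketched with a generic successive-halving loop. This is not GraSH itself: `evaluate` is a hypothetical callback, `fidelity` is an assumed stand-in for the fraction of the graph and/or epochs used in a cheap training run, and `eta` and `min_fidelity` are illustrative parameters.

```python
def successive_halving(configs, evaluate, eta=2, min_fidelity=0.25):
    """Keep the best 1/eta of the configurations each round while raising fidelity.

    evaluate(config, fidelity) -> float is a hypothetical callback; higher is better.
    """
    survivors = list(configs)
    if not survivors:
        raise ValueError("at least one configuration is required")
    fidelity = min_fidelity
    while len(survivors) > 1:
        # Rank all surviving configurations at the current (cheap) fidelity
        ranked = sorted(survivors, key=lambda c: evaluate(c, fidelity), reverse=True)
        survivors = ranked[: max(1, len(ranked) // eta)]
        # Spend more budget per configuration in the next round
        fidelity = min(1.0, fidelity * eta)
    return survivors[0]


def toy_evaluate(config, fidelity):
    """Made-up objective: learning rates near 0.1 score best."""
    return -abs(config["lr"] - 0.1) / fidelity


candidates = [{"lr": 0.001}, {"lr": 0.01}, {"lr": 0.1}, {"lr": 1.0}]
best = successive_halving(candidates, toy_evaluate)
```

Most of the budget goes to the few configurations that survive the cheap early rounds, which is why such schemes can cost little more than a handful of full training runs.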


The Extensibility of Knowledge Graphs for Natural Language Understanding

#artificialintelligence

The universal applicability of enterprise knowledge across use cases, domains, and languages is widely understood. It is likely the main reason adoption rates for knowledge graphs have risen steadily of late, making them one of the most utilitarian forms of AI available today. True knowledge graphs are extensible and predicated on standards designed to share data of any type. Such graphs are inherently composable, enabling users to either combine them or enrich them with knowledge of all sorts. These options are critical not only for simplifying the management of enterprise knowledge for Natural Language Understanding deployments, but also for redoubling the value organizations reap from knowledge graphs across a burgeoning array of use cases.


Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning

arXiv.org Artificial Intelligence

Negative sampling (NS) loss plays an important role in learning knowledge graph embedding (KGE) to handle a huge number of entities. However, the performance of KGE degrades unless hyperparameters such as the margin term and the number of negative samples in NS loss are appropriately selected. Currently, empirical hyperparameter tuning addresses this problem at the cost of computational time. To solve this problem, we theoretically analyzed NS loss to assist hyperparameter tuning and to better understand how to use NS loss in KGE learning. Our theoretical analysis showed that scoring methods with restricted value ranges, such as TransE and RotatE, require appropriate adjustment of the margin term or the number of negative samples, different from those without restricted value ranges, such as RESCAL, ComplEx, and DistMult. We also propose subsampling methods specialized for the NS loss in KGE, studied from a theoretical perspective. Our empirical analysis on the FB15k-237, WN18RR, and YAGO3-10 datasets showed that the results of actually trained models agree with our theoretical findings.
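
For readers unfamiliar with the two hyperparameters discussed above, the sketch below computes one commonly used form of the NS loss for a single positive triple, with a margin term and k negative samples. The function names are hypothetical and the exact formulation in the paper may differ; for distance-based models such as TransE, the score is assumed to be the negative distance.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def ns_loss(pos_score, neg_scores, margin):
    """NS loss for one positive triple and k sampled negatives.

    loss = -log sigmoid(s_pos + margin)
           - (1/k) * sum(log sigmoid(-s_neg - margin))
    """
    k = len(neg_scores)
    if k == 0:
        raise ValueError("at least one negative sample is required")
    pos_term = -log_sigmoid(pos_score + margin)
    neg_term = -sum(log_sigmoid(-(s + margin)) for s in neg_scores) / k
    return pos_term + neg_term


# Uninformative scores give 2 * ln(2); separating positives from negatives lowers the loss.
baseline = ns_loss(0.0, [0.0], 0.0)
separated = ns_loss(5.0, [-5.0, -4.0], 0.0)
```

When scores are bounded (for example, a negative distance can never exceed zero), the margin shifts where the sigmoid saturates, which gives some intuition for why such models need it tuned with care.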


Positive-Unlabeled Learning with Adversarial Data Augmentation for Knowledge Graph Completion

arXiv.org Artificial Intelligence

Most real-world knowledge graphs (KGs) are far from complete and comprehensive. This problem has motivated efforts in predicting the most plausible missing facts to complete a given KG, i.e., knowledge graph completion (KGC). However, existing KGC methods suffer from two main issues: 1) the false negative issue, i.e., the sampled negative training instances may include potential true facts; and 2) the data sparsity issue, i.e., true facts account for only a tiny part of all possible facts. To this end, we propose positive-unlabeled learning with adversarial data augmentation (PUDA) for KGC. In particular, PUDA tailors a positive-unlabeled risk estimator for the KGC task to deal with the false negative issue. Furthermore, to address the data sparsity issue, PUDA achieves a data augmentation strategy by unifying adversarial training and positive-unlabeled learning under the positive-unlabeled minimax game. Extensive experimental results on real-world benchmark datasets demonstrate the effectiveness and compatibility of our proposed method.


A Double-Graph Based Framework for Frame Semantic Parsing

arXiv.org Artificial Intelligence

Frame semantic parsing is a fundamental NLP task, which consists of three subtasks: frame identification, argument identification and role classification. Most previous studies tend to neglect relations between different subtasks and arguments and pay little attention to ontological frame knowledge defined in FrameNet. In this paper, we propose a Knowledge-guided Incremental semantic parser with Double-graph (KID). We first introduce Frame Knowledge Graph (FKG), a heterogeneous graph containing both frames and FEs (Frame Elements) built on the frame knowledge so that we can derive knowledge-enhanced representations for frames and FEs. In addition, we propose Frame Semantic Graph (FSG) to represent frame semantic structures extracted from the text with graph structures. In this way, we can transform frame semantic parsing into an incremental graph construction problem to strengthen interactions between subtasks and relations between arguments. Our experiments show that KID outperforms the previous state-of-the-art method by up to 1.7 F1-score on two FrameNet datasets. Our code is available at https://github.com/PKUnlp-icler/KID.


The Foundation of Data Fabrics and AI: Semantic Knowledge Graphs - DataScienceCentral.com

#artificialintelligence

Data management agility has become of key importance to organizations as the amount and complexity of data continue to increase, along with the desire to avoid creating new data silos. The concept of creating a 'data fabric' as an agile design concept has been proposed by leading analysts, such as Mark Beyer, Distinguished VP Analyst at Gartner. "The emerging design concept called 'data fabric' can be a robust solution to ever-present data management challenges, such as the high-cost and low-value of data integration cycles, frequent maintenance of earlier integrations, the rising demand for real-time and event-driven data sharing, and more," says Mark Beyer. As a data fabric readily connects and provides singular access to all data sources distributed throughout the enterprise, semantic knowledge graphs provide the foundation that makes this design possible. Semantic knowledge graphs and aspects of AI are necessary for the data fabric architecture to work.