Semantic Networks
r/MachineLearning - [N] Pre-trained knowledge graph embedding models are available in GraphVite!
In the recent update of GraphVite, we release a new large-scale knowledge graph dataset, along with new benchmarks of knowledge graph embedding methods. The dataset, Wikidata5m, contains 5 million entities and 21 million facts constructed from Wikidata and Wikipedia. Most of the entities come from the general domain or the scientific domain, such as celebrities, events, concepts and things. To facilitate the usage of knowledge graph representations in semantic tasks, we provide a bunch of pre-trained embeddings from popular models, including TransE, DistMult, ComplEx, SimplE and RotatE. You can directly access these embeddings by natural language index, such as "machine learning", "united states" or even abbreviations like "m.i.t.".
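As a rough illustration of how such embeddings are used, the sketch below scores a triple with the standard TransE and DistMult scoring functions. It does not use the GraphVite API; the function names and the toy vectors are made up for this example.

```python
import math

def transe_score(h, r, t):
    """TransE: negative L2 distance between h + r and t (higher = more plausible)."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def distmult_score(h, r, t):
    """DistMult: trilinear product sum_i h_i * r_i * t_i."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))

# Toy 3-dimensional vectors, invented for illustration (not taken from Wikidata5m).
head = [0.1, 0.3, -0.2]
relation = [0.4, -0.1, 0.5]
tail_good = [0.5, 0.2, 0.3]   # roughly head + relation
tail_bad = [-0.9, 0.8, 0.0]

print(transe_score(head, relation, tail_good))
print(transe_score(head, relation, tail_bad))
```

With real pre-trained vectors, the same functions would rank candidate tails for a given head and relation.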
Practical AI #65: Intelligent systems and knowledge graphs with James Fletcher, principal scientist at Grakn Labs
Temporal Knowledge Graph Embedding Model based on Additive Time Series Decomposition
Xu, Chengjin, Nayyeri, Mojtaba, Alkhoury, Fouad, Lehmann, Jens, Yazdi, Hamed Shariat
Knowledge Graph (KG) embedding has attracted more attention in recent years. Most KG embedding models learn from time-unaware triples. However, the inclusion of temporal information alongside triples would further improve the performance of a KGE model. In this regard, we propose ATiSE, a temporal KG embedding model which incorporates time information into entity/relation representations by using Additive Time Series decomposition. Moreover, considering the temporal uncertainty during the evolution of entity/relation representations over time, we map the representations of temporal KGs into the space of multidimensional Gaussian distributions. The mean of each entity/relation embedding at a time step shows the current expected position, whereas its covariance (which is temporally stationary) represents its temporal uncertainty. Experimental results show that ATiSE not only achieves the state-of-the-art on link prediction over temporal KGs, but also can predict the occurrence time of facts with missing time annotations, as well as the existence of future events. To the best of our knowledge, no other model is capable of performing all these tasks.
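The idea of a time-dependent mean plus a stationary covariance can be sketched as follows. The abstract does not spell out the decomposition, so this example assumes a linear trend, a sinusoidal seasonal term, and a diagonal covariance; all function and parameter names are hypothetical.

```python
import math
import random

def mean_at_time(base, trend_dir, alpha, beta, omega, t):
    """Assumed additive decomposition: base + linear trend + sinusoidal seasonality."""
    seasonal = beta * math.sin(2 * math.pi * omega * t)
    return [b + alpha * w * t + seasonal for b, w in zip(base, trend_dir)]

def sample_at_time(mean, variances, rng):
    """Draw from a Gaussian whose (diagonal) covariance does not depend on t."""
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, variances)]

# Toy 2-dimensional entity, invented for illustration.
base = [1.0, 2.0]
trend_dir = [0.5, -0.5]
mu = mean_at_time(base, trend_dir, alpha=2.0, beta=0.0, omega=1.0, t=3)
draw = sample_at_time(mu, [0.01, 0.01], random.Random(0))
print(mu, draw)
```

The mean moves with the time step while the variances stay fixed, which mirrors the "expected position" versus "temporal uncertainty" split described above.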
Using Mapping Languages for Building Legal Knowledge Graphs from XML Files
Junior, Ademar Crotti, Orlandi, Fabrizio, O'Sullivan, Declan, Dirschl, Christian, Reul, Quentin
This paper presents our experience on building RDF knowledge graphs for an industrial use case in the legal domain. The information contained in legal information systems is often accessed through simple keyword interfaces and presented as a simple list of hits. In order to improve search accuracy one may avail of knowledge graphs, where the semantics of the data can be made explicit. Significant research effort has been invested in the area of building knowledge graphs from semi-structured text documents, such as XML, with the prevailing approach being the use of mapping languages. In this paper, we present a semantic model for representing legal documents together with an industrial use case. We also present a set of use case requirements based on the proposed semantic model, which are used to compare and discuss the use of state-of-the-art mapping languages for building knowledge graphs for legal data.
Keywords: Mapping languages · Legal Knowledge Graphs · Legal semantic model
1 Introduction
The body of law to which citizens and businesses have to adhere is constantly increasing in volume and complexity [2]. The information contained in such a body of law is usually provided by unstructured text within legal documents, for which a number of systems have been developed. The information made available by such legal information systems, however, is often accessed with simple, keyword-based search interfaces and presented as a simple list of hits [7].
InteractE: Improving Convolution-based Knowledge Graph Embeddings by Increasing Feature Interactions
Vashishth, Shikhar, Sanyal, Soumya, Nitin, Vikram, Agrawal, Nilesh, Talukdar, Partha
Most existing knowledge graphs suffer from incompleteness, which can be alleviated by inferring missing links based on known facts. One popular way to accomplish this is to generate low-dimensional embeddings of entities and relations, and use these to make inferences. ConvE, a recently proposed approach, applies convolutional filters on 2D reshapings of entity and relation embeddings in order to capture rich interactions between their components. However, the number of interactions that ConvE can capture is limited. In this paper, we analyze how increasing the number of these interactions affects link prediction performance, and utilize our observations to propose InteractE. InteractE is based on three key ideas -- feature permutation, a novel feature reshaping, and circular convolution. Through extensive experiments, we find that InteractE outperforms state-of-the-art convolutional link prediction baselines on FB15k-237. Further, InteractE achieves an MRR score that is 9%, 7.5%, and 23% better than ConvE on the FB15k-237, WN18RR and YAGO3-10 datasets respectively. The results validate our central hypothesis -- that increasing feature interaction is beneficial to link prediction performance. We make the source code of InteractE available to encourage reproducible research.
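Circular convolution, one of the three ideas named above, differs from ordinary convolution in that the input wraps around at the borders instead of being zero-padded. A minimal pure-Python sketch follows; it uses the cross-correlation convention common in deep learning and is not taken from the InteractE source code.

```python
def circular_conv2d(x, kernel):
    """Cross-correlate x with kernel, wrapping indices around the borders."""
    height, width = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0.0
            for a in range(kh):
                for b in range(kw):
                    # Modulo arithmetic implements the wrap-around padding.
                    row = (i + a - kh // 2) % height
                    col = (j + b - kw // 2) % width
                    total += x[row][col] * kernel[a][b]
            out[i][j] = total
    return out

# Toy 2x3 "reshaped embedding" and a kernel that picks the left neighbour.
grid = [[1, 2, 3], [4, 5, 6]]
left_neighbour = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
print(circular_conv2d(grid, left_neighbour))
```

Because of the wrap-around, the first column sees the last column as its left neighbour, so border components interact with components on the opposite edge.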
Inductive Relation Prediction on Knowledge Graphs
Teru, Komal K., Hamilton, William L.
Inferring missing edges in multi-relational knowledge graphs is a fundamental task in statistical relational learning. However, previous work has largely focused on the transductive relation prediction problem, where missing edges must be predicted for a single, fixed graph. In contrast, many real-world situations require relation prediction on dynamic or previously unseen knowledge graphs (e.g., for question answering, dialogue, or e-commerce applications). Here, we develop a novel graph neural network (GNN) architecture to perform inductive relation prediction and provide a systematic comparison between this GNN approach and a strong, rule-based baseline. Our results highlight the significant difficulty of inductive relational learning, compared to the transductive case, and offer a new challenging set of inductive benchmarks for knowledge graph completion.
Predicting microRNA-disease associations from knowledge graph using tensor decomposition with relational constraints
Huang, Feng, Xiong, Zhankun, Zhang, Guan, Yu, Zhouxin, Xu, Xinran, Zhang, Wen
Motivation: MiRNAs are a kind of small non-coding RNAs that are not translated into proteins, and aberrant expression of miRNAs is associated with human diseases. Since miRNAs have different roles in diseases, the miRNA-disease associations are categorized into multiple types according to their roles. Predicting miRNA-disease associations and types is critical to understand the underlying pathogenesis of human diseases from the molecular level. Results: In this paper, we formulate the problem as a link prediction in knowledge graphs. We use biomedical knowledge bases to build a knowledge graph of entities representing miRNAs and disease and multi-relations, and we propose a tensor decomposition-based model named TDRC to predict miRNA-disease associations and their types from the knowledge graph. We have experimentally evaluated our method and compared it to several baseline methods. The results demonstrate that the proposed method has high-accuracy and high-efficiency performances.
Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding
Tang, Yun, Huang, Jing, Wang, Guangtao, He, Xiaodong, Zhou, Bowen
Translational distance-based knowledge graph embedding has shown progressive improvements on the link prediction task, from TransE to the latest state-of-the-art RotatE. However, N-1, 1-N and N-N predictions still remain challenging. In this work, we propose a novel translational distance-based approach for knowledge graph link prediction. The proposed method is two-fold. First, we extend RotatE from the 2D complex domain to a high-dimensional space with orthogonal transforms to model relations for better modeling capacity. Second, the graph context is explicitly modeled via two directed context representations. These context representations are used as part of the distance scoring function to measure the plausibility of the triples during training and inference. The proposed approach effectively improves prediction accuracy on the difficult N-1, 1-N and N-N cases for the knowledge graph link prediction task. The experimental results show that it achieves better performance on two benchmark data sets compared to the baseline RotatE, especially on the data set (FB15k-237) with many high in-degree connection nodes.
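For context, the RotatE baseline treats each relation as an element-wise rotation in the complex plane, so a plausible triple has the rotated head close to the tail. The sketch below shows that baseline only, not the orthogonal-transform extension proposed here; the distance is taken as the sum of complex moduli, which is one common choice, and the names and toy values are illustrative.

```python
import cmath
import math

def rotate_score(head, phases, tail):
    """RotatE-style score: rotate each head component by a unit-modulus relation."""
    rotations = [cmath.exp(1j * p) for p in phases]   # |r_i| = 1 by construction
    return -sum(abs(h * r - t) for h, r, t in zip(head, rotations, tail))

# Toy 2-dimensional complex embeddings, invented for illustration.
head = [1 + 0j, 0 + 1j]
phases = [math.pi / 2, math.pi]        # rotate by 90 and 180 degrees
tail_match = [0 + 1j, 0 - 1j]          # exactly the rotated head
tail_other = [1 + 0j, 1 + 0j]

print(rotate_score(head, phases, tail_match))
print(rotate_score(head, phases, tail_other))
```

A 2D rotation is itself an orthogonal transform, which is why generalizing to higher-dimensional orthogonal transforms is a natural extension.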
Ruminating Word Representations with Random Noised Masker
Jo, Hwiyeol, Zhang, Byoung-Tak
We introduce a training method for both better word representation and performance, which we call GROVER (Gradual Rumination On the Vector with maskERs). The method is to gradually and iteratively add random noises to word embeddings while training a model. GROVER first starts from conventional training process, and then extracts the fine-tuned representations. Next, we gradually add random noises to the word representations and repeat the training process from scratch, but initialize with the noised word representations. Through the re-training process, we can mitigate some noises to be compensated and utilize other noises to learn better representations. As a result, we can get word representations further fine-tuned and specialized on the task. When we experiment with our method on 5 text classification datasets, our method improves model performances on most of the datasets. Moreover, we show that our method can be combined with other regularization techniques, further improving the model performance.
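The noising step can be sketched roughly as below. The abstract does not specify the noise distribution or how the mask is chosen, so this example assumes Gaussian noise applied to a random subset of components; the function name, `mask_prob`, and `noise_scale` are hypothetical.

```python
import random

def add_masked_noise(embeddings, mask_prob, noise_scale, rng):
    """Return a copy of the embeddings with noise added to randomly masked components."""
    noised = []
    for vec in embeddings:
        noised.append([
            x + rng.gauss(0.0, noise_scale) if rng.random() < mask_prob else x
            for x in vec
        ])
    return noised

# Toy fine-tuned word vectors, invented for illustration.
fine_tuned = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.5]]
rng = random.Random(0)
initial = add_masked_noise(fine_tuned, mask_prob=0.5, noise_scale=0.1, rng=rng)
print(initial)
```

In the procedure described above, the noised vectors would initialize the next round of training from scratch, and the extract-noise-retrain loop would repeat.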
Knowledge Graphs & NLP @ EMNLP 2019 Part I
Language models (LMs) are the hottest topic in NLP research right now. The most prominent examples are BERT and GPT-2, but new LMs trained on humongous volumes of text are published every month. Are LMs capable of encoding knowledge in a way similar to knowledge graphs? Petroni et al. study this problem by comparing language models with knowledge graphs on Question Answering and NLG tasks where factual knowledge is required, e.g., a question is posed by inserting a MASK token in place of the answer. It turns out that LMs perform comparably to KGs on very simple questions such as "Adolphe Adam died in [Paris]".