 Semantic Networks


Robustly Extracting Medical Knowledge from EHRs: A Case Study of Learning a Health Knowledge Graph

arXiv.org Machine Learning

Increasingly large electronic health records (EHRs) provide an opportunity to algorithmically learn medical knowledge. In one prominent example, a causal health knowledge graph could learn relationships between diseases and symptoms and then serve as a diagnostic tool to be refined with additional clinical input. Prior research has demonstrated the ability to construct such a graph from over 270,000 emergency department patient visits. In this work, we describe methods to evaluate a health knowledge graph for robustness. Moving beyond precision and recall, we analyze for which diseases and for which patients the graph is most accurate. We identify sample size and unmeasured confounders as major sources of error in the health knowledge graph. We introduce a method to leverage non-linear functions in building the causal graph to better understand existing model assumptions. Finally, to assess model generalizability, we extend to a larger set of complete patient visits within a hospital system. We conclude with a discussion on how to robustly extract medical knowledge from EHRs.


MTab: Matching Tabular Data to Knowledge Graph using Probability Models

arXiv.org Artificial Intelligence

This paper presents the design of our system, MTab, for the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab 2019). MTab combines a voting algorithm and probability models to solve critical problems in the matching tasks.


Representation Learning with Ordered Relation Paths for Knowledge Graph Completion

arXiv.org Artificial Intelligence

Incompleteness is a common problem for existing knowledge graphs (KGs), and KG completion, which aims to predict links between entities, is challenging. Most existing KG completion methods only consider the direct relation between nodes and ignore the relation paths, which contain useful information for link prediction. Recently, a few methods have taken relation paths into consideration but pay less attention to the order of relations in paths, which is important for reasoning. In addition, these path-based models always ignore nonlinear contributions of path features for link prediction. To solve these problems, we propose a novel KG completion method named OPTransE. Instead of embedding both entities of a relation into the same latent space as in previous methods, we project the head entity and the tail entity of each relation into different spaces to guarantee the order of relations in the path. Meanwhile, we adopt a pooling strategy to extract nonlinear and complex features of different paths to further improve the performance of link prediction. Experimental results on two benchmark datasets show that the proposed model OPTransE performs better than state-of-the-art methods.
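To illustrate why the order of relations in a path can be lost, the following minimal sketch contrasts purely additive (translation-style) path composition with a composition that applies a relation-specific projection before each translation. It is not the OPTransE model itself: the 2-dimensional vectors, the matrices, and the helper names (`additive_path`, `projected_path`) are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def add(a: Sequence[float], b: Sequence[float]) -> Vector:
    """Element-wise vector addition."""
    return [x + y for x, y in zip(a, b)]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    """Multiply a matrix by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def additive_path(start: Vector, translations: Sequence[Vector]) -> Vector:
    """Follow a path by adding each relation vector (translation-style)."""
    out = list(start)
    for r in translations:
        out = add(out, r)
    return out


def projected_path(start: Vector, steps: Sequence[Tuple[Matrix, Vector]]) -> Vector:
    """Follow a path where each step first projects, then translates."""
    out = list(start)
    for projection, r in steps:
        out = add(matvec(projection, out), r)
    return out


# Toy, hypothetical embeddings.
h = [1.0, 0.0]
r1 = [0.5, 0.0]
r2 = [0.0, 2.0]
swap = [[0.0, 1.0], [1.0, 0.0]]    # hypothetical projection for relation 1
ident = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical projection for relation 2

# Pure addition is commutative, so both orders land on the same point.
additive_forward = additive_path(h, [r1, r2])
additive_reverse = additive_path(h, [r2, r1])

# With relation-specific projections the two orders end up in different places.
projected_forward = projected_path(h, [(swap, r1), (ident, r2)])
projected_reverse = projected_path(h, [(ident, r2), (swap, r1)])

print(additive_forward == additive_reverse)    # order is invisible
print(projected_forward != projected_reverse)  # order is preserved
```

The design point is only that a non-commutative step (here a matrix applied before the translation) makes the path representation depend on relation order; how the cited method chooses and trains its projections is not shown here.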


On Understanding Knowledge Graph Representation

arXiv.org Machine Learning

Many methods have been developed to represent knowledge graph data, which implicitly exploit low-rank latent structure in the data to encode known information and enable unknown facts to be inferred. To predict whether a relationship holds between entities, their embeddings are typically compared in the latent space following a relation-specific mapping. Whilst link prediction has steadily improved, the latent structure, and hence why such models capture semantic information, remains unexplained. We build on recent theoretical interpretation of word embeddings as a basis to consider an explicit structure for representations of relations between entities. For identifiable relation types, we are able to predict properties and justify the relative performance of leading knowledge graph representation methods, including their often overlooked ability to make independent predictions.


r/MachineLearning - [R] Enriching BERT with Knowledge Graph Embeddings for Document Classification

#artificialintelligence

In this paper, we focus on the classification of books using short descriptive texts (cover blurbs) and additional metadata. Building upon BERT, a deep neural language model, we demonstrate how to combine text representations with metadata and knowledge graph embeddings, which encode author information. Compared to the standard BERT approach, we achieve considerably better results for the classification task. For a more coarse-grained classification using eight labels we achieve an F1-score of 87.20, while a detailed classification using 343 labels yields an F1-score of 64.70. We make the source code and trained models of our experiments publicly available.


Exploring Scholarly Data by Semantic Query on Knowledge Graph Embedding Space

arXiv.org Artificial Intelligence

The trends of open science have enabled several open scholarly datasets which include millions of papers and authors. Managing, exploring, and utilizing such large and complicated datasets effectively are challenging. In recent years, the knowledge graph has emerged as a universal data format for representing knowledge about heterogeneous entities and their relationships. The knowledge graph can be modeled by knowledge graph embedding methods, which represent entities and relations as embedding vectors in semantic space, then model the interactions between these embedding vectors. However, the semantic structures in the knowledge graph embedding space are not well-studied, thus knowledge graph embedding methods are usually only used for knowledge graph completion but not data representation and analysis. In this paper, we propose to analyze these semantic structures based on the well-studied word embedding space and use them to support data exploration. We also define the semantic queries, which are algebraic operations between the embedding vectors in the knowledge graph embedding space, to solve queries such as similarity and analogy between the entities on the original datasets. We then design a general framework for data exploration by semantic queries and discuss the solution to some traditional scholarly data exploration tasks. We also propose some new interesting tasks that can be solved based on the uncanny semantic structures of the embedding space.
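As a rough sketch of what semantic queries as algebraic operations on embedding vectors could look like, the snippet below implements a similarity query (nearest neighbour by cosine similarity) and an analogy query (vector offset). The entity names and 3-dimensional vectors are invented for illustration and are not taken from the paper; the function names are likewise assumptions.

```python
import math
from typing import Dict, Iterable, List, Sequence

Vector = List[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query: Sequence[float],
                 embeddings: Dict[str, Vector],
                 exclude: Iterable[str] = ()) -> str:
    """Similarity query: name of the entity closest to the query vector."""
    excluded = set(exclude)
    candidates = [name for name in embeddings if name not in excluded]
    return max(candidates, key=lambda name: cosine(query, embeddings[name]))


def analogy(a: str, b: str, c: str, embeddings: Dict[str, Vector]) -> str:
    """Analogy query: 'a is to b as c is to ?', via the offset b - a + c."""
    query = [vb - va + vc
             for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])]
    return most_similar(query, embeddings, exclude={a, b, c})


# Hypothetical toy embeddings for a scholarly knowledge graph.
toy = {
    "author_A": [1.0, 0.0, 0.0],
    "paper_A": [1.0, 1.0, 0.0],
    "author_B": [0.0, 0.0, 1.0],
    "paper_B": [0.0, 1.0, 1.0],
    "venue_X": [0.5, 0.0, 1.0],
}

nearest_to_author_a = most_similar(toy["author_A"], toy, exclude={"author_A"})
analogy_result = analogy("author_A", "paper_A", "author_B", toy)
print(nearest_to_author_a, analogy_result)
```

In practice the vectors would come from a trained knowledge graph embedding model, and a brute-force scan like this would typically be replaced by an approximate nearest-neighbour index for large datasets.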


HapPenIng: Happen, Predict, Infer -- Event Series Completion in a Knowledge Graph

arXiv.org Artificial Intelligence

Event series, such as the Wimbledon Championships and the US presidential elections, represent important happenings in key societal areas including sports, culture and politics. However, semantic reference sources, such as Wikidata, DBpedia and EventKG knowledge graphs, provide only an incomplete event series representation. In this paper we target the problem of event series completion in a knowledge graph. We address two tasks: 1) prediction of sub-event relations, and 2) inference of real-world events that happened as a part of event series and are missing in the knowledge graph. To address these problems, our proposed supervised HapPenIng approach leverages structural features of event series. HapPenIng does not require any external knowledge, a characteristic that makes it unique in the context of event inference. Our experimental evaluation demonstrates that HapPenIng outperforms the baselines by 44 and 52 percentage points in terms of precision for the sub-event prediction and the inference tasks, respectively.

1 Introduction
Event series, such as sports tournaments, music festivals and political elections, are sequences of recurring events. Prominent examples include the Wimbledon Championships, the Summer Olympic Games, the United States presidential elections and the International Semantic Web Conference. The provision of reliable reference sources for event series is of crucial importance for many real-world applications, for example in the context of Digital Humanities and Web Science research [7, 9, 25], as well as media analytics and digital journalism [15, 23].


Group Representation Theory for Knowledge Graph Embedding

arXiv.org Artificial Intelligence

Knowledge graph embedding has recently become a popular way to model relations and infer missing links. In this paper, we present a group-theoretical perspective on knowledge graph embedding, connecting previous methods with different group actions. Furthermore, by utilizing Schur's lemma from group representation theory, we show that the state-of-the-art embedding method RotatE can model relations from any finite Abelian group.
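A small numerical sketch may help make the group-action view concrete. RotatE represents a relation as an element-wise rotation of a complex-valued entity embedding; the snippet below checks two properties of such rotations on toy values: a rotation by 2π/n applied n times returns to the start (the behaviour of a cyclic group of order n), and two rotations commute (the Abelian property). The embedding values, phases, and helper names are assumptions for illustration, and the code does not reproduce the paper's argument via Schur's lemma.

```python
import cmath
import math
from typing import List, Sequence

ComplexVector = List[complex]


def rotation(phases: Sequence[float]) -> ComplexVector:
    """Unit-modulus complex numbers e^{i * phase}, one per dimension."""
    return [cmath.exp(1j * p) for p in phases]


def apply(entity: Sequence[complex], relation: Sequence[complex]) -> ComplexVector:
    """Rotate each coordinate of the entity embedding (element-wise product)."""
    return [e * r for e, r in zip(entity, relation)]


def distance(a: Sequence[complex], b: Sequence[complex]) -> float:
    """Sum of coordinate-wise moduli of the difference."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical 2-dimensional complex entity embedding.
h = [1 + 0j, 0.5 + 0.5j]

# A relation acting like a generator of the cyclic group of order n.
n = 4
generator = rotation([2 * math.pi / n] * len(h))

one_step_distance = distance(apply(h, generator), h)

current = list(h)
for _ in range(n):
    current = apply(current, generator)
returned_distance = distance(current, h)  # close to zero up to rounding

# Two arbitrary rotations commute, as expected for an Abelian group.
r_a = rotation([0.3, 1.1])
r_b = rotation([2.0, 0.7])
commute_gap = distance(apply(apply(h, r_a), r_b), apply(apply(h, r_b), r_a))

print(one_step_distance > 0.1, returned_distance < 1e-9, commute_gap < 1e-9)
```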


KG-BERT: BERT for Knowledge Graph Completion

arXiv.org Artificial Intelligence

Knowledge graphs are important resources for many artificial intelligence tasks but often suffer from incompleteness. In this work, we propose to use pre-trained language models for knowledge graph completion. We treat triples in knowledge graphs as textual sequences and propose a novel framework named Knowledge Graph Bidirectional Encoder Representations from Transformer (KG-BERT) to model these triples. Our method takes the entity and relation descriptions of a triple as input and computes the scoring function of the triple with the KG-BERT language model. Experimental results on multiple benchmark knowledge graphs show that our method can achieve state-of-the-art performance in triple classification, link prediction and relation prediction tasks.
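The idea of treating a triple as a textual sequence can be sketched as follows: the head, relation, and tail descriptions are packed into one BERT-style token sequence delimited by `[CLS]` and `[SEP]`. This is only an input-formatting sketch under simplifying assumptions: it uses lowercase whitespace splitting instead of a real WordPiece tokenizer, the function name `pack_triple` and the example triple are hypothetical, and no model or scoring function is included.

```python
from typing import List


def pack_triple(head: str, relation: str, tail: str) -> List[str]:
    """Pack a (head, relation, tail) triple into one BERT-style token sequence.

    Assumption: whitespace splitting stands in for a subword tokenizer.
    """
    tokens = ["[CLS]"]
    for text in (head, relation, tail):
        tokens.extend(text.lower().split())
        tokens.append("[SEP]")
    return tokens


# Illustrative triple, not taken from any benchmark dataset.
example_tokens = pack_triple("Steve Jobs", "founded", "Apple Inc.")
print(" ".join(example_tokens))
```

A classifier head over the `[CLS]` position of the encoded sequence could then produce a plausibility score for the triple; that part depends on the chosen model and library and is left out here.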


Combination of Unified Embedding Model and Observed Features for Knowledge Graph Completion

arXiv.org Artificial Intelligence

Takuma Ebisu 1,2 and Ryutaro Ichise 2,1,3
1 SOKENDAI (The Graduate University for Advanced Studies), 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, Japan
2 National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, Japan
3 National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi, Koto-ku, Tokyo, Japan
{takuma,ichise}@nii.ac.jp

Abstract
Knowledge graphs are useful for many artificial intelligence tasks but often have missing data. Hence, a method for completing knowledge graphs is required. Existing approaches include embedding models, the Path Ranking Algorithm, and rule evaluation models. However, these approaches have limitations. For example, all the information is mixed and difficult to interpret in embedding models, and traditional rule evaluation models are basically slow. In this paper, we provide an integrated view of various approaches and combine them to compensate for their limitations. We first unify state-of-the-art embedding models, such as ComplEx and TorusE, reinterpreting them as a variant of translation-based models. Then, we show that these models utilize paths for link prediction and propose a method for evaluating rules based on this idea. Finally, we combine an embedding model and observed feature models to predict missing triples. This is possible because all of these models utilize paths. We also conduct experiments, including link prediction tasks, with standard datasets to evaluate our method and framework. The experiments show that our method can evaluate rules faster than traditional methods and that our framework outperforms state-of-the-art models in terms of link prediction.

1 Introduction
Knowledge graphs are used to describe many types of real-world relations in a form that can be easily processed by a computer.