Semantic Networks
Knowledge Graphs and Machine Learning in biased C4I applications
Paparidis, Evangelos, Kotis, Konstantinos
This paper introduces our position on the critical issue of bias that recently appeared in AI applications. Specifically, we discuss the combination of current technologies used in AI applications i.e., Machine Learning and Knowledge Graphs, and point to their involvement in (de)biased applications of the C4I domain. Although this is a wider problem that currently emerges from different application domains, bias appears more critical in C4I than in others due to its security-related nature. While proposing certain actions to be taken towards debiasing C4I applications, we acknowledge the immature aspect of this topic within the Knowledge Graph and Semantic Web communities.
Query Embedding on Hyper-relational Knowledge Graphs
Alivanistos, Dimitrios, Berrendorf, Max, Cochez, Michael, Galkin, Mikhail
Multi-hop logical reasoning is an established problem in the field of representation learning on knowledge graphs (KGs). It subsumes both one-hop link prediction as well as other more complex types of logical queries. Existing algorithms operate only on classical, triple-based graphs, whereas modern KGs often employ a hyper-relational modeling paradigm. In this paradigm, typed edges may have several key-value pairs known as qualifiers that provide fine-grained context for facts. In queries, this context modifies the meaning of relations, and usually reduces the answer set. Hyper-relational queries are often observed in real-world KG applications, and existing approaches for approximate query answering cannot make use of qualifier pairs. In this work, we bridge this gap and extend the multi-hop reasoning problem to hyper-relational KGs allowing to tackle this new type of complex queries. Building upon recent advancements in Graph Neural Networks and query embedding techniques, we study how to embed and answer hyper-relational conjunctive queries. Besides that, we propose a method to answer such queries and demonstrate in our experiments that qualifiers improve query answering on a diverse set of query patterns.
An Intelligent Question Answering System based on Power Knowledge Graph
Tang, Yachen, Han, Haiyun, Yu, Xianmao, Zhao, Jing, Liu, Guangyi, Wei, Longfei
The intelligent question answering (IQA) system can accurately capture users' search intention by understanding the natural language questions, searching relevant content efficiently from a massive knowledge-base, and returning the answer directly to the user. Since the IQA system can save inestimable time and workforce in data search and reasoning, it has received more and more attention in data science and artificial intelligence. This article introduced a domain knowledge graph using the graph database and graph computing technologies from massive heterogeneous data in electric power. It then proposed an IQA system based on the electrical power knowledge graph to extract the intent and constraints of natural interrogation based on the natural language processing (NLP) method, to construct graph data query statements via knowledge reasoning, and to complete the accurate knowledge search and analysis to provide users with an intuitive visualization. This method thoroughly combined knowledge graph and graph computing characteristics, realized high-speed multi-hop knowledge correlation reasoning analysis in tremendous knowledge. The proposed work can also provide a basis for the context-aware intelligent question and answer.
PRASEMap: A Probabilistic Reasoning and Semantic Embedding based Knowledge Graph Alignment System
Qi, Zhiyuan, Zhang, Ziheng, Chen, Jiaoyan, Chen, Xi, Zheng, Yefeng
Knowledge Graph (KG) alignment aims at finding equivalent entities and relations (i.e., mappings) between two KGs. The existing approaches utilize either reasoning-based or semantic embedding-based techniques, but few studies explore their combination. In this demonstration, we present PRASEMap, an unsupervised KG alignment system that iteratively computes the Mappings with both Probabilistic Reasoning (PR) And Semantic Embedding (SE) techniques. PRASEMap can support various embedding-based KG alignment approaches as the SE module, and enables easy human computer interaction that additionally provides an option for users to feed the mapping annotations back to the system for better results. The demonstration showcases these features via a stand-alone Web application with user friendly interfaces.
Node Classification Meets Link Prediction on Knowledge Graphs
Abboud, Ralph, Ceylan, İsmail İlkan
Node classification and link prediction are widely studied tasks in graph representation learning. While both transductive node classification and link prediction operate over a single input graph, they are studied in isolation so far, which leads to discrepancies. Node classification models take as input a graph with node features and incomplete node labels, and implicitly assume that the input graph is relationally complete, i.e., no edges are missing from the input graph. This is in sharp contrast with link prediction models that are solely motivated by the relational incompleteness of the input graph which does not have any node features. We propose a unifying perspective and study the problems of (i) transductive node classification over incomplete graphs and (ii) link prediction over graphs with node features. We propose an extension to an existing box embedding model, and show that this model is fully expressive, and can solve both of these tasks in an end-to-end fashion. To empirically evaluate our model, we construct a knowledge graph with node features, which is challenging both for node classification and link prediction. Our model performs very strongly when compared to the respective state-of-the-art models for node classification and link prediction on this dataset and shows the importance of a unified perspective for node classification and link prediction on knowledge graphs.
Unified Interpretation of Softmax Cross-Entropy and Negative Sampling: With Case Study for Knowledge Graph Embedding
Kamigaito, Hidetaka, Hayashi, Katsuhiko
In knowledge graph embedding, the theoretical relationship between the softmax cross-entropy and negative sampling loss functions has not been investigated. This makes it difficult to fairly compare the results of the two different loss functions. We attempted to solve this problem by using the Bregman divergence to provide a unified interpretation of the softmax cross-entropy and negative sampling loss functions. Under this interpretation, we can derive theoretical findings for fair comparison. Experimental results on the FB15k-237 and WN18RR datasets show that the theoretical findings are valid in practical settings.
Engineering Knowledge Graph from Patent Database
Siddharth, L, Blessing, Lucienne T. M., Wood, Kristin L., Luo, Jianxi
We propose a large, scalable engineering knowledge graph, comprising sets of (entity, relationship, entity) triples that are real-world engineering facts found in the patent database. We apply a set of rules based on the syntactic and lexical properties of claims in a patent document to extract facts. We aggregate these facts within each patent document and integrate the aggregated sets of facts across the patent database to obtain the engineering knowledge graph. Such a knowledge graph is expected to support inference, reasoning, and recalling in various engineering tasks. The knowledge graph has a greater size and coverage in comparison with the previously used knowledge graphs and semantic networks in the engineering literature.
Understanding the Performance of Knowledge Graph Embeddings in Drug Discovery
Bonner, Stephen, Barrett, Ian P, Ye, Cheng, Swiers, Rowan, Engkvist, Ola, Hoyt, Charles Tapley, Hamilton, William L
Knowledge Graphs (KG) and associated Knowledge Graph Embedding (KGE) models have recently begun to be explored in the context of drug discovery and have the potential to assist in key challenges such as target identification. In the drug discovery domain, KGs can be employed as part of a process which can result in lab-based experiments being performed, or impact on other decisions, incurring significant time and financial costs and most importantly, ultimately influencing patient healthcare. For KGE models to have impact in this domain, a better understanding of not only of performance, but also the various factors which determine it, is required. In this study we investigate, over the course of many thousands of experiments, the predictive performance of five KGE models on two public drug discovery-oriented KGs. Our goal is not to focus on the best overall model or configuration, instead we take a deeper look at how performance can be affected by changes in the training setup, choice of hyperparameters, model parameter initialisation seed and different splits of the datasets. Our results highlight that these factors have significant impact on performance and can even affect the ranking of models. Indeed these factors should be reported along with model architectures to ensure complete reproducibility and fair comparisons of future work, and we argue this is critical for the acceptance of use, and impact of KGEs in a biomedical setting. To aid reproducibility of our own work, we release all experimentation code.
An evaluation of template and ML-based generation of user-readable text from a knowledge graph
Mahlaza, Zola, Keet, C. Maria, Dunn, Jarryd, Poulter, Matthew
Typical user-friendly renderings of knowledge graphs are visualisations and natural language text. Within the latter HCI solution approach, data-driven natural language generation systems receive increased attention, but they are often outperformed by template-based systems due to suffering from errors such as content dropping, hallucination, or repetition. It is unknown which of those errors are associated significantly with low quality judgements by humans who the text is aimed for, which hampers addressing errors based on their impact on improving human evaluations. We assessed their possible association with an experiment availing of expert and crowdsourced evaluations of human authored text, template generated text, and sequence-to-sequence model generated text. The results showed that there was no significant association between human authored texts with errors and the low human judgements of naturalness and quality. There was also no significant association between machine learning generated texts with dropped or hallucinated slots and the low human judgements of naturalness and quality. Thus, both approaches appear to be viable options for designing a natural language interface for knowledge graphs.
Knowledge Graphs 2.0: High Performance Computing Emerges - insideBIGDATA
They're the most effective means of preparing data for statistical AI, creditable knowledge graph platforms utilize supervised and unsupervised learning to accelerate numerous processes, and their smart inferences are a form of machine intelligence. Coupling knowledge graphs with high performance computing enables organizations to not only avail themselves of sophisticated techniques to optimize AI, but also employ it at the scale and speed of contemporary data demands. According to Katana Graph CEO Keshav Pingali, "There is a need for high performance graph computing…in two ways. One is the volume of data, and the other is time to insight." Scaling knowledge graphs with high performance computing is a means of rapidly analyzing the tremendous data quantities organizations routinely contend with for informed, low latent action across numerous use cases including "intrusion detection, fraud detection, and Anti-Money Laundering," Pingali noted.