 Semantic Networks


Semantic AI: Bringing Machine Learning and Knowledge Graphs Together

#artificialintelligence

Hybrid Computing, and with it Hybrid Analytics, is changing rapidly with the introduction of Edge and Fog Computing, in the wake of new mobility and IoT communication protocols, technologies and practices being phased into the industry on a daily basis, 5G being the latest illustration. Our objective is to shed some light on the various impacts, both positive and challenging, that these transformations have on Cloud Analytics. This session will first address what these changes mean for Cloud Analytics and, in particular, the new considerations, key assets and enabling paradigms being introduced, both in terms of functional architectures and of the underlying infrastructures supporting ingestion, distributed processing and the resulting insights, in the cloud, in the fog, and at the edge, along with the potential they unlock and the pitfalls associated with them. As part of these considerations, the session will address the intrinsic security, information privacy and data protection concerns, and the characteristics specific to hybrid architectures that allow for new ways to compartmentalize privacy and protect anonymity while maintaining the same descriptive and predictive capabilities. Unfortunately, we'll see that these new hybrid architectures can also harbor new combinations of vulnerabilities.


Tackling scalability issues in mining path patterns from knowledge graphs: a preliminary study

arXiv.org Artificial Intelligence

Features mined from knowledge graphs are widely used within multiple knowledge discovery tasks such as classification or fact-checking. Here, we consider a given set of vertices, called seed vertices, and focus on mining their associated neighboring vertices, paths, and, more generally, path patterns that involve classes of ontologies linked with knowledge graphs. Due to the combinatorial nature and the increasing size of real-world knowledge graphs, the task of mining these patterns immediately entails scalability issues. In this paper, we address these issues by proposing a pattern mining approach that relies on a set of constraints (e.g., support or degree thresholds) and the monotonicity property. As our motivation comes from the mining of real-world knowledge graphs, we illustrate our approach with PGxLOD, a biomedical knowledge graph.
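
The pruning idea in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: here a path pattern is simplified to a sequence of predicates (the paper also involves ontology classes), support is assumed to mean the number of seed vertices from which the pattern can be followed, and the function name, the toy triples and the thresholds are all hypothetical. Because extending a pattern can never raise its support, a pattern that falls below the threshold is dropped and never extended.

```python
from collections import defaultdict


def mine_path_patterns(edges, seeds, min_support, max_length):
    """Return {pattern: support} for predicate sequences starting at seeds.

    Assumption: support = number of seeds with at least one matching path.
    """
    # Index outgoing edges: subject -> [(predicate, object), ...]
    out = defaultdict(list)
    for s, p, o in edges:
        out[s].append((p, o))

    # frontier maps a pattern to {seed: set of vertices reached via it}
    frontier = {(): {seed: {seed} for seed in seeds}}
    frequent = {}
    for _ in range(max_length):
        candidates = defaultdict(lambda: defaultdict(set))
        for pattern, reach in frontier.items():
            for seed, ends in reach.items():
                for v in ends:
                    for p, o in out.get(v, ()):
                        candidates[pattern + (p,)][seed].add(o)
        # Keep only patterns meeting the threshold; by monotonicity,
        # extensions of a pruned pattern cannot become frequent.
        frontier = {}
        for pattern, reach in candidates.items():
            support = len(reach)
            if support >= min_support:
                frequent[pattern] = support
                frontier[pattern] = reach
    return frequent


# Hypothetical toy graph and seeds, for illustration only.
edges = [
    ("drugA", "treats", "dis1"),
    ("drugB", "treats", "dis2"),
    ("drugC", "causes", "dis1"),
    ("dis1", "subClassOf", "Disease"),
    ("dis2", "subClassOf", "Disease"),
]
patterns = mine_path_patterns(
    edges, seeds=["drugA", "drugB", "drugC"], min_support=2, max_length=2
)
print(patterns)
```

In this toy run the pattern `("causes",)` is reached from only one seed, so it is pruned and `("causes", "subClassOf")` is never generated.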


Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework

arXiv.org Artificial Intelligence

The heterogeneity in recently published knowledge graph embedding models' implementations, training, and evaluation has made fair and thorough comparisons difficult. In order to assess the reproducibility of previously published results, we re-implemented and evaluated 19 interaction models in the PyKEEN software package. Here, we outline which results could be reproduced with their reported hyper-parameters, which could only be reproduced with alternate hyper-parameters, and which could not be reproduced at all, and we provide insight as to why this might be the case. We then performed a large-scale benchmarking on four datasets with several thousand experiments and 21,246 GPU hours of computation time. We present insights gained as to best practices, best configurations for each model, and where improvements could be made over previously published best configurations. Our results highlight that a model's performance is not determined by its architecture alone: the combination of model architecture, training approach, loss function, and the explicit modeling of inverse relations is crucial. We provide evidence that several architectures can obtain results competitive with the state of the art when configured carefully. We have made all code, experimental configurations, results, and analyses that led to our interpretations available at https://github.com/pykeen/pykeen and https://github.com/pykeen/benchmarking.


PyKEEN 1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings

arXiv.org Artificial Intelligence

Recently, knowledge graph embeddings (KGEs) have received significant attention, and several software libraries have been developed for training and evaluating KGEs. While each of them addresses specific needs, we re-designed and re-implemented PyKEEN, one of the first KGE libraries, in a community effort. PyKEEN 1.0 enables users to compose knowledge graph embedding models (KGEMs) based on a wide range of interaction models, training approaches, and loss functions, and it permits the explicit modeling of inverse relations. In addition, automatic memory optimization has been implemented to make the best use of the available hardware, and extensive hyper-parameter optimization (HPO) functionality is provided through the integration of Optuna.
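
As a rough illustration of what explicit modeling of inverse relations usually means, the sketch below augments a set of triples so that every (head, relation, tail) fact also appears as (tail, inverse relation, head). This is plain Python and not PyKEEN's API; the function name, the `_inverse` suffix and the example triples are assumptions made for the example.

```python
def add_inverse_triples(triples, suffix="_inverse"):
    """Return the triples plus one inverse triple per original triple.

    Hypothetical helper: the suffix naming scheme is an assumption.
    """
    augmented = list(triples)
    # For each (h, r, t), add (t, r + suffix, h) so that head prediction
    # can be phrased as tail prediction on the inverse relation.
    augmented.extend((t, r + suffix, h) for h, r, t in triples)
    return augmented


# Hypothetical example triples.
triples = [("Berlin", "capital_of", "Germany"), ("Paris", "capital_of", "France")]
augmented = add_inverse_triples(triples)
for triple in augmented:
    print(triple)
```

With this augmentation a model only needs to score tails: the query (?, capital_of, Germany) becomes (Germany, capital_of_inverse, ?).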


A Survey on Graph Neural Networks for Knowledge Graph Completion

arXiv.org Artificial Intelligence

Knowledge Graphs are becoming increasingly popular for a variety of downstream tasks such as Question Answering and Information Retrieval. However, Knowledge Graphs are often incomplete, which leads to poor performance. As a result, there has been a lot of interest in the task of Knowledge Base Completion. More recently, Graph Neural Networks have been used to capture structural information inherently stored in these Knowledge Graphs and have been shown to achieve SOTA performance across a variety of datasets. In this survey, we examine the various strengths and weaknesses of the proposed methodology and try to identify new research problems in this area that require further investigation.


KGCNs: Machine Learning over Knowledge Graphs with TensorFlow

#artificialintelligence

This project introduces a novel model: the Knowledge Graph Convolutional Network (KGCN), available free to use from the GitHub repo under Apache licensing. It's written in Python and can be installed via pip from PyPI. The principal idea of this work is to forge a bridge between knowledge graphs, automated logical reasoning, and machine learning, using Grakn as the knowledge graph. A KGCN can be used to create vector representations (embeddings) of any labelled set of Grakn Things via supervised learning. There are many benefits to storing complex and interrelated data in a knowledge graph, not least that the context of each datapoint can be stored in full.


Time-aware Graph Embedding: A temporal smoothness and task-oriented approach

arXiv.org Machine Learning

Knowledge graph embedding, which aims to learn low-dimensional representations of entities and relationships, has attracted considerable research effort recently. However, most knowledge graph embedding methods focus on the structural relationships in fixed triples while ignoring temporal information. Existing time-aware graph embedding methods focus only on factual plausibility and ignore temporal smoothness, which models the interactions between a fact and its contexts and can thus capture fine-grained temporal relationships. This limits the performance of embedding-related applications. To solve this problem, this paper presents a Robustly Time-aware Graph Embedding (RTGE) method that incorporates temporal smoothness. The paper makes two major contributions. First, RTGE integrates a measure of temporal smoothness into the learning process of the time-aware graph embedding. Via the proposed additional smoothing factor, RTGE can preserve both the structural information and the evolutionary patterns of a given graph. Second, RTGE provides a general task-oriented negative sampling strategy that uses temporally-aware information, which further improves the adaptive ability of the proposed algorithm and plays an essential role in obtaining superior performance in various tasks. Extensive experiments conducted on multiple benchmark tasks show that RTGE can increase performance in entity/relationship/temporal scoping prediction tasks.
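
The abstract does not give RTGE's smoothing factor, so the following is only a generic sketch of a temporal smoothness penalty, not the paper's formulation. It assumes each time step has its own embedding per entity and penalizes the squared distance between an entity's embeddings at consecutive time steps; the function name and the toy vectors are hypothetical.

```python
def temporal_smoothness_penalty(embeddings_by_time):
    """Sum of squared changes of each entity's embedding between
    consecutive time steps.

    embeddings_by_time: list of {entity: vector} dicts ordered by time.
    Assumption: a generic regularizer, not RTGE's actual smoothing factor.
    """
    penalty = 0.0
    for prev, curr in zip(embeddings_by_time, embeddings_by_time[1:]):
        # Only entities present at both time steps contribute.
        for entity in prev.keys() & curr.keys():
            penalty += sum(
                (a - b) ** 2 for a, b in zip(prev[entity], curr[entity])
            )
    return penalty


# Hypothetical snapshots of one entity over three time steps.
snapshots = [
    {"e": [0.0, 0.0]},
    {"e": [1.0, 0.0]},
    {"e": [1.0, 2.0]},
]
penalty = temporal_smoothness_penalty(snapshots)
print(penalty)  # 1.0 + 4.0 = 5.0
```

In training, a term like this would typically be added to the usual plausibility loss with a weighting coefficient, so that embeddings drift gradually over time instead of jumping between snapshots.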


COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation

arXiv.org Artificial Intelligence

To combat COVID-19, both clinicians and scientists need to digest the vast amount of relevant biomedical knowledge in literature to understand the disease mechanism and the related biological functions. We have developed a novel and comprehensive knowledge discovery framework, COVID-KG, to extract fine-grained multimedia knowledge elements (entities, relations and events) from scientific literature. We then exploit the constructed multimedia knowledge graphs (KGs) for question answering and report generation, using drug repurposing as a case study. Our framework also provides detailed contextual sentences, subfigures and knowledge subgraphs as evidence. All of the data, KGs, reports, resources and shared services are publicly available.


Knowledge Graph Extraction from Videos

arXiv.org Artificial Intelligence

Nearly all existing techniques for automated video annotation (or captioning) describe videos using natural language sentences. However, this has several shortcomings: (i) it is very hard to use the generated natural language annotations further in automated data processing, (ii) generating natural language annotations requires solving the hard subtask of generating semantically precise and syntactically correct natural language sentences, which is actually unrelated to the task of video annotation, (iii) it is difficult to quantitatively measure performance, as standard metrics (e.g., accuracy and F1-score) are inapplicable, and (iv) annotations are language-specific. In this paper, we propose the new task of knowledge graph extraction from videos, i.e., producing a description of the contents of a given video in the form of a knowledge graph. Since no datasets exist for this task, we also include a method to automatically generate them, starting from datasets where videos are annotated with natural language. We then describe an initial deep-learning model for knowledge graph extraction from videos, and report results on MSVD* and MSR-VTT*, two datasets obtained from MSVD and MSR-VTT using our method.


Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs

arXiv.org Machine Learning

We present a multi-relational temporal Knowledge Graph based on the daily interactions between artifacts in GitHub, one of the largest social coding platforms. Such a representation enables posing many user-activity and project management questions as link prediction and time queries over the knowledge graph. In particular, we introduce two new datasets for i) interpolated time-conditioned link prediction and ii) extrapolated time-conditioned link/time prediction queries, each with distinct properties. Our experiments on these datasets highlight the potential of adapting knowledge graphs to answer broad software engineering questions. They also reveal the unsatisfactory performance of existing temporal models on extrapolated queries and on time prediction queries in general. To overcome these shortcomings, we introduce an extension to current temporal models using relative temporal information with regard to past events.
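
One simple way to picture relative temporal information with regard to past events is to attach, to each timestamped fact, the time elapsed since the previous event involving the same subject. The sketch below does only that; it is not the authors' model extension, and the function name, the event tuple layout and the example events are assumptions made for illustration.

```python
def add_relative_time(events):
    """Attach the time since the subject's previous event to each event.

    events: iterable of (subject, relation, object, timestamp) tuples.
    Returns tuples extended with a delta, or None for a subject's first event.
    Assumption: "relative time" is measured per subject entity.
    """
    last_seen = {}
    result = []
    # Process events in chronological order.
    for s, r, o, t in sorted(events, key=lambda e: e[3]):
        delta = t - last_seen[s] if s in last_seen else None
        result.append((s, r, o, t, delta))
        last_seen[s] = t
    return result


# Hypothetical GitHub-style events with integer timestamps.
events = [
    ("alice", "opened", "issue1", 10),
    ("alice", "closed", "issue1", 15),
    ("bob", "commented_on", "issue1", 12),
]
enriched = add_relative_time(events)
for event in enriched:
    print(event)
```

A feature like this stays meaningful for extrapolated queries, where absolute timestamps lie outside the range seen during training.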