 Semantic Networks


Information Prediction using Knowledge Graphs for Contextual Malware Threat Intelligence

arXiv.org Artificial Intelligence

Large amounts of threat intelligence information about malware attacks are available in disparate, typically unstructured, formats. Knowledge graphs can capture this information and its context using RDF triples represented by entities and relations. Sparse or inaccurate threat information, however, leads to challenges such as incomplete or erroneous triples. Named entity recognition (NER) and relation extraction (RE) models used to populate the knowledge graph cannot fully guarantee accurate information retrieval, further exacerbating this problem. This paper proposes an end-to-end approach to generate a Malware Knowledge Graph called MalKG, the first open-source automated knowledge graph for malware threat intelligence. The MalKG dataset, called MT40K, contains approximately 40,000 triples generated from 27,354 unique entities and 34 relations. We demonstrate the application of MalKG in predicting missing malware threat intelligence information in the knowledge graph. For ground truth, we manually curate a knowledge graph called MT3K, with 3,027 triples generated from 5,741 unique entities and 22 relations. For entity prediction via a state-of-the-art entity prediction model (TuckER), our approach achieves 80.4 for the hits@10 metric (predicts the top 10 options for missing entities in the knowledge graph), and 0.75 for the MRR (mean reciprocal rank). We also propose a framework to automate the extraction of thousands of entities and relations into RDF triples, both manually and automatically, at the sentence level from 1,100 malware threat intelligence reports and from the common vulnerabilities and exposures (CVE) database.
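The two metrics quoted above can be illustrated with a minimal sketch. It assumes each test triple has already been assigned the 1-based rank of its correct entity among all candidates; the function names and the example ranks are hypothetical and do not come from the MalKG paper. Hits@10 is often reported as a percentage, which is presumably how the 80.4 figure should be read.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k=10):
    """Fraction of queries whose correct entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct entity for five test triples.
example_ranks = [1, 2, 4, 10, 50]
mrr = mean_reciprocal_rank(example_ranks)
hits10 = hits_at_k(example_ranks, k=10)
print(f"MRR={mrr:.3f}  Hits@10={hits10:.1%}")
```

MRR rewards placing the correct entity near the top of the list, while Hits@10 only checks whether it appears among the first ten candidates.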


LineaRE: Simple but Powerful Knowledge Graph Embedding for Link Prediction

arXiv.org Artificial Intelligence

The task of link prediction for knowledge graphs is to predict missing relationships between entities. Knowledge graph embedding, which aims to represent entities and relations of a knowledge graph as low-dimensional vectors in a continuous vector space, has achieved promising predictive performance. If an embedding model can cover as many connectivity patterns and mapping properties of relations as possible, it will potentially bring more benefits for link prediction tasks. In this paper, we propose a novel embedding model, namely LineaRE, which is capable of modeling four connectivity patterns (i.e., symmetry, antisymmetry, inversion, and composition) and four mapping properties (i.e., one-to-one, one-to-many, many-to-one, and many-to-many) of relations. Specifically, we regard knowledge graph embedding as a simple linear regression task, where a relation is modeled as a linear function of two low-dimensional vector-presented entities with two weight vectors and a bias vector. Since the vectors are defined in a real number space and the scoring function of the model is linear, our model is simple and scalable to large knowledge graphs. Experimental results on multiple widely used real-world datasets show that the proposed LineaRE model significantly outperforms existing state-of-the-art models for link prediction tasks.
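A minimal sketch of what such a linear scoring function could look like follows. The abstract only states that a relation is a linear function of the two entity vectors with two weight vectors and a bias vector; the element-wise weighting and the L1 residual used here are assumptions, and all names and values are hypothetical.

```python
def linear_relation_score(head, tail, w_head, w_tail, bias):
    """L1 residual of the relation  w_head * h + bias = w_tail * t.

    All arguments are equal-length lists of floats; the products are
    element-wise. A lower score means the triple fits the relation better.
    """
    return sum(
        abs(wh * h + b - wt * t)
        for h, t, wh, wt, b in zip(head, tail, w_head, w_tail, bias)
    )


# Hypothetical 2-dimensional embeddings for one relation.
head = [1.0, 2.0]
w_head = [2.0, 1.0]
w_tail = [1.0, 3.0]
bias = [1.0, 1.0]

exact = linear_relation_score(head, [3.0, 1.0], w_head, w_tail, bias)
perturbed = linear_relation_score(head, [3.0, 2.0], w_head, w_tail, bias)
print(exact, perturbed)
```

Because every parameter is a real-valued vector and the score is a sum of absolute linear residuals, the cost per triple grows linearly with the embedding dimension.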


Knowledge Graph Embedding using Graph Convolutional Networks with Relation-Aware Attention

arXiv.org Artificial Intelligence

Knowledge graph embedding methods learn embeddings of entities and relations in a low-dimensional space which can be used for various downstream machine learning tasks such as link prediction and entity matching. Various graph convolutional network methods have been proposed which use different types of information to learn the features of entities and relations. However, these methods assign the same weight (importance) to the neighbors when aggregating the information, ignoring the role of different relations with the neighboring entities. To this end, we propose a relation-aware graph attention model that leverages relation information to compute different weights for the neighboring nodes when learning embeddings of entities and relations. We evaluate our proposed approach on link prediction and entity matching tasks. Our experimental results on link prediction on three datasets (one proprietary and two public) and results on unsupervised entity matching on one proprietary dataset demonstrate the effectiveness of the relation-aware attention.
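The idea of weighting neighbors by their relation can be sketched as follows. This is an illustrative simplification, not the paper's model: the attention score (a dot product between the center entity and the sum of relation and neighbor vectors) and all names and values are assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def relation_aware_weights(center, neighbors):
    """Softmax attention weights over (relation_vector, neighbor_vector) pairs."""
    scores = [
        dot(center, [r + n for r, n in zip(rel, nb)]) for rel, nb in neighbors
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(center, neighbors):
    """Attention-weighted sum of the neighbor vectors."""
    weights = relation_aware_weights(center, neighbors)
    return [
        sum(w * nb[i] for w, (_, nb) in zip(weights, neighbors))
        for i in range(len(center))
    ]


# Two identical neighbor vectors reached through different (hypothetical) relations.
center = [1.0, 0.0]
neighbors = [([1.0, 0.0], [0.5, 0.5]), ([-1.0, 0.0], [0.5, 0.5])]
weights = relation_aware_weights(center, neighbors)
pooled = aggregate(center, neighbors)
```

With uniform aggregation both neighbors would receive weight 0.5; here the first neighbor gets the larger weight purely because of the relation through which it is connected.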


Learning Intents behind Interactions with Knowledge Graph for Recommendation

arXiv.org Artificial Intelligence

Knowledge graphs (KGs) play an increasingly important role in recommender systems. A recent technical trend is to develop end-to-end models founded on graph neural networks (GNNs). However, existing GNN-based models are coarse-grained in relational modeling, failing to (1) identify user-item relations at a fine-grained level of intents, and (2) exploit relation dependencies to preserve the semantics of long-range connectivity. In this study, we explore intents behind a user-item interaction by using auxiliary item knowledge, and propose a new model, Knowledge Graph-based Intent Network (KGIN). Technically, we model each intent as an attentive combination of KG relations, encouraging the independence of different intents for better model capability and interpretability. Furthermore, we devise a new information aggregation scheme for GNN, which recursively integrates the relation sequences of long-range connectivity (i.e., relational paths). This scheme allows us to distill useful information about user intents and encode them into the representations of users and items. Experimental results on three benchmark datasets show that KGIN achieves significant improvements over state-of-the-art methods like KGAT, KGNN-LS, and CKAN. Further analyses show that KGIN offers interpretable explanations for predictions by identifying influential intents and relational paths. The implementations are available at https://github.com/huangtinglin/Knowledge_Graph_based_Intent_Network.


Exploring the Limits of Few-Shot Link Prediction in Knowledge Graphs

arXiv.org Artificial Intelligence

Real-world knowledge graphs are often characterized by low-frequency relations - a challenge that has prompted an increasing interest in few-shot link prediction methods. These methods perform link prediction for a set of new relations, unseen during training, given only a few example facts of each relation at test time. In this work, we perform a systematic study on a spectrum of models derived by generalizing the current state of the art for few-shot link prediction, with the goal of probing the limits of learning in this few-shot setting. We find that a simple zero-shot baseline - which ignores any relation-specific information - achieves surprisingly strong performance. Moreover, experiments on carefully crafted synthetic datasets show that having only a few examples of a relation fundamentally limits models from using fine-grained structural information and only allows for exploiting the coarse-grained positional information of entities. Together, our findings challenge the implicit assumptions and inductive biases of prior work and highlight new directions for research in this area.


Combat Data Shift in Few-shot Learning with Knowledge Graph

arXiv.org Artificial Intelligence

Many few-shot learning approaches have been designed under the meta-learning framework, which learns from a variety of learning tasks and generalizes to new tasks. These meta-learning approaches achieve the expected performance in the scenario where all samples are drawn from the same distributions (i.i.d. observations). However, in real-world applications, the few-shot learning paradigm often suffers from data shift, i.e., samples in different tasks, even in the same task, could be drawn from various data distributions. Most existing few-shot learning approaches are not designed with data shift in mind, and thus show degraded performance when the data distribution shifts. However, it is non-trivial to address the data shift problem in few-shot learning, due to the limited number of labeled samples in each task. To address this problem, we propose a novel metric-based meta-learning framework to extract task-specific representations and task-shared representations with the help of a knowledge graph. The data shift within/between tasks can thus be combated by the combination of task-shared and task-specific representations. The proposed model is evaluated on popular benchmarks and two newly constructed challenging datasets. The evaluation results demonstrate its remarkable performance.


xERTE: Explainable Reasoning on Temporal Knowledge Graphs for Forecasting Future Links

arXiv.org Artificial Intelligence

Interest has been rising lately towards modeling time-evolving knowledge graphs (KGs). Recently, graph representation learning approaches have become the dominant paradigm for link prediction on temporal KGs. However, the embedding-based approaches largely operate in a black-box fashion, lacking the ability to judge the results' reliability. This paper provides a future link forecasting framework that reasons over query-relevant subgraphs of temporal KGs and jointly models the graph structures and the temporal context information. In particular, we propose a temporal relational attention mechanism and a novel reverse representation update scheme to guide the extraction of an enclosing subgraph around the query. The subgraph is expanded by an iterative sampling of temporal neighbors and attention propagation. As a result, our approach provides human-understandable arguments for the prediction. We evaluate our model on four benchmark temporal knowledge graphs for the link forecasting task. While being more explainable, our model also obtains a relative improvement of up to 17.7% on MRR compared to the previous best KG forecasting methods. We also conduct a survey with 53 respondents, and the results show that the reasoning arguments extracted by the model for link forecasting are aligned with human understanding.
Reasoning, a process of inferring new knowledge from available facts, has long been considered to be an essential subject in artificial intelligence (AI). Recently, the KG-augmented reasoning process has been studied in (Das et al., 2017; Ren et al., 2020), where knowledge graphs store factual information in the form of triples (s, p, o). In particular, s (subject) and o (object) are expressed as nodes in knowledge graphs and p (predicate) as an edge type. Most knowledge graph models assume that the underlying graph is static. However, in the real world, facts and knowledge change with time, which can be treated as time-dependent multi-relational data.
To accommodate time-evolving multi-relational data, temporal KGs have been introduced (Boschee et al., 2015), where temporal events are represented as a quadruple by extending the static triplet with timestamps describing when these events occurred, i.e. (Barack Obama, inaugurated, as president of the US, 2009/01/20).
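The quadruple representation described above can be sketched as a small data structure. The type and function names are hypothetical, and the second fact is a placeholder rather than data from any benchmark; the filter simply illustrates that a forecasting model may only look at facts that precede the query time.

```python
from datetime import date
from typing import NamedTuple


class Quadruple(NamedTuple):
    """A static triple (subject, predicate, object) extended with a timestamp."""
    subject: str
    predicate: str
    obj: str
    timestamp: date


def facts_before(facts, cutoff):
    """Return the facts strictly earlier than `cutoff` (the usable history)."""
    return [f for f in facts if f.timestamp < cutoff]


facts = [
    Quadruple("Barack Obama", "inaugurated", "as president of the US", date(2009, 1, 20)),
    Quadruple("entity_a", "relation_x", "entity_b", date(2010, 6, 1)),  # placeholder fact
]
history = facts_before(facts, date(2010, 1, 1))
```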


Mining Knowledge Graphs From Incident Reports

arXiv.org Artificial Intelligence

Incident management is a critical part of the DevOps processes for developing and operating large-scale services in the cloud. Incident reports filed by customers are largely unstructured, making any automated diagnosis or mitigation non-trivial. On-call engineers must parse verbose reports to understand the issue and locate key information. Prior work has looked into extraction of key attributes or entities like error codes, tenant IDs, stack traces, etc. from incident and bug reports. Although a flat list of entities is informative, to unlock the full potential of knowledge extraction, it is necessary to provide context to these entities. For instance, the relations between the real-world concepts or objects that these entities represent in otherwise unstructured data are useful for downstream tasks like incident linking, triaging and mitigation. With this additional context, entities are transformed from "Strings" to "Things". In this work, we present an approach to mine and score binary entity relations from co-occurring entity pairs. We evaluate the extracted binary relations and show that our approach has a high precision of 0.9. Further, we construct knowledge graphs automatically and show that the implicit knowledge in the graph can be used to mine and rank relevant entities for distinct incidents, by mapping entities to clusters of incident titles.


Temporal Knowledge Graph Forecasting with Neural ODE

arXiv.org Artificial Intelligence

Learning node representation on dynamically-evolving, multi-relational graph data has gained great research interest. However, most of the existing models for temporal knowledge graph forecasting use Recurrent Neural Network (RNN) with discrete depth to capture temporal information, while time is a continuous variable. Inspired by Neural Ordinary Differential Equation (NODE), we extend the idea of continuum-depth models to time-evolving multi-relational graph data, and propose a novel Temporal Knowledge Graph Forecasting model with NODE. Our model captures temporal information through NODE and structural information through a Graph Neural Network (GNN). Thus, our graph ODE model achieves a continuous model in time and efficiently learns node representation for future prediction. We evaluate our model on six temporal knowledge graph datasets by performing link forecasting. Experiment results show the superiority of our model.


MöbiusE: Knowledge Graph Embedding on Möbius Ring

arXiv.org Artificial Intelligence

In this work, we propose a novel Knowledge Graph Embedding (KGE) strategy, called MöbiusE, in which the entities and relations are embedded on the surface of a Möbius ring. The strategy is inspired by the classic TorusE, in which the addition of two arbitrary elements is subject to a modulus operation. In this sense, TorusE naturally guarantees the critical boundedness of embedding vectors in KGE. However, the nonlinearity of the addition operation on the torus derives solely from the modulus operation, which to some extent restricts the expressiveness of TorusE. As a further generalization of TorusE, MöbiusE also uses the modulus operation to preserve the closure of its addition operation, but the coordinates on the Möbius ring interact with each other in the following way: any vector on the surface of a Möbius ring that moves along its parametric trace will point in exactly the opposite direction after one cycle. Hence, MöbiusE has considerably more nonlinear expressiveness than TorusE, and in turn it generates more precise embedding results. In our experiments, MöbiusE outperforms TorusE and other classic embedding strategies in several key indicators.
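The modulus-based addition that TorusE relies on can be sketched in a few lines. This illustrates only the torus case, with each coordinate living in [0, 1) and an L1-style wrap-around distance; the coordinate coupling specific to MöbiusE is not reproduced here, and the function names and values are hypothetical.

```python
def wrap(x):
    """Map a real number onto the unit circle [0, 1)."""
    return x % 1.0


def translate(head, relation):
    """Add a relation to a head embedding; the modulus keeps the result bounded."""
    return [wrap(h + r) for h, r in zip(head, relation)]


def torus_distance(a, b):
    """Sum over coordinates of the shorter way around the circle."""
    total = 0.0
    for x, y in zip(a, b):
        d = abs(wrap(x) - wrap(y))
        total += min(d, 1.0 - d)
    return total


head = [0.9, 0.25]
relation = [0.25, 0.5]
moved = translate(head, relation)            # first coordinate wraps past 1.0
near = torus_distance([0.95], [0.05])        # close across the wrap-around point
```

However large the relation vector is, the translated point stays inside [0, 1) in every coordinate, which is the boundedness property the abstract attributes to TorusE.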