composition pattern
ExpressivE: A Spatio-Functional Embedding For Knowledge Graph Completion
Pavlović, Aleksandar, Sallinger, Emanuel
Knowledge graphs are inherently incomplete. Therefore substantial research has been directed toward knowledge graph completion (KGC), i.e., predicting missing triples from the information represented in the knowledge graph (KG). KG embedding models (KGEs) have yielded promising results for KGC, yet any current KGE is incapable of: (1) fully capturing vital inference patterns (e.g., composition), (2) capturing prominent patterns jointly (e.g., hierarchy and composition), and (3) providing an intuitive interpretation of captured patterns. In this work, we propose ExpressivE, a fully expressive spatio-functional KGE that solves all these challenges simultaneously. ExpressivE embeds pairs of entities as points and relations as hyper-parallelograms in the virtual triple space $\mathbb{R}^{2d}$. This model design allows ExpressivE not only to capture a rich set of inference patterns jointly but additionally to display any supported inference pattern through the spatial relation of hyper-parallelograms, offering an intuitive and consistent geometric interpretation of ExpressivE embeddings and their captured patterns. Experimental results on standard KGC benchmarks reveal that ExpressivE is competitive with state-of-the-art KGEs and even significantly outperforms them on WN18RR.
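The geometric idea in this abstract, a pair of entity embeddings treated as one point that either falls inside a relation's region or does not, can be sketched as a membership test. The abstract does not give the parameterization, so everything below is an assumption made for illustration: each dimension of the region is described by two band inequalities, and all names (`center_h`, `slope_t`, `width_h`, and so on) are hypothetical.

```python
def in_region(head, tail, center_h, center_t, slope_h, slope_t, width_h, width_t):
    """Return True when the pair (head, tail) lies inside the relation region.

    Assumed form, checked independently per dimension i:
      |head[i] - center_h[i] - slope_t[i] * tail[i]| <= width_h[i]
      |tail[i] - center_t[i] - slope_h[i] * head[i]| <= width_t[i]
    The intersection of the two bands is a parallelogram in the (head, tail) plane.
    """
    for i in range(len(head)):
        head_ok = abs(head[i] - center_h[i] - slope_t[i] * tail[i]) <= width_h[i]
        tail_ok = abs(tail[i] - center_t[i] - slope_h[i] * head[i]) <= width_t[i]
        if not (head_ok and tail_ok):
            return False
    return True


# Toy one-dimensional relation: a band around the line head == tail.
wide = dict(center_h=[0.0], center_t=[0.0], slope_h=[1.0], slope_t=[1.0],
            width_h=[0.5], width_t=[0.5])
# A narrower region with the same center and slopes lies inside the wide one,
# which is how a sub-relation (hierarchy) could be pictured geometrically.
narrow = dict(wide, width_h=[0.1], width_t=[0.1])

inside_wide = in_region([1.0], [1.2], **wide)      # close to the line
inside_narrow = in_region([1.0], [1.2], **narrow)  # too far for the narrow band
outside_wide = in_region([1.0], [3.0], **wide)     # far from the line
```

In this toy setup, any pair accepted by `narrow` is also accepted by `wide`, which mirrors the kind of spatial containment the abstract describes for displaying inference patterns.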
Understanding the Complexity and Its Impact on Testing in ML-Enabled Systems
Cao, Junming, Chen, Bihuan, Hu, Longjie, Gao, Jie, Huang, Kaifeng, Peng, Xin
Machine learning (ML) enabled systems are emerging with recent breakthroughs in ML. A model-centric view is widely taken by the literature to focus only on the analysis of ML models. However, only a small body of work takes a system view that looks at how ML components work with the system and how they affect software engineering for ML-enabled systems. In this paper, we adopt this system view, and conduct a case study on Rasa 3.0, an industrial dialogue system that has been widely adopted by various companies around the world. Our goal is to characterize the complexity of such a large-scale ML-enabled system and to understand the impact of the complexity on testing. Our study reveals practical implications for software engineering for ML-enabled systems.
LineaRE: Simple but Powerful Knowledge Graph Embedding for Link Prediction
The task of link prediction for knowledge graphs is to predict missing relationships between entities. Knowledge graph embedding, which aims to represent entities and relations of a knowledge graph as low-dimensional vectors in a continuous vector space, has achieved promising predictive performance. If an embedding model can cover as many connectivity patterns and mapping properties of relations as possible, it can bring more benefits to link prediction tasks. In this paper, we propose a novel embedding model, namely LineaRE, which is capable of modeling four connectivity patterns (i.e., symmetry, antisymmetry, inversion, and composition) and four mapping properties (i.e., one-to-one, one-to-many, many-to-one, and many-to-many) of relations. Specifically, we regard knowledge graph embedding as a simple linear regression task, where a relation is modeled as a linear function of two entities, each represented as a low-dimensional vector, with two weight vectors and a bias vector. Since the vectors are defined in a real number space and the scoring function of the model is linear, our model is simple and scalable to large knowledge graphs. Experimental results on multiple widely used real-world datasets show that the proposed LineaRE model significantly outperforms existing state-of-the-art models for link prediction tasks.
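A minimal sketch of the "linear function with two weight vectors and a bias vector" described above. The abstract does not state the exact scoring function, so the form used here (an L1 residual of `w_head * h + bias - w_tail * t`) and all names are assumptions chosen for illustration.

```python
def linear_residual(head, tail, w_head, w_tail, bias):
    """L1 norm of (w_head * head + bias - w_tail * tail), element-wise.

    A residual of 0 means the triple fits the relation exactly under the
    assumed linear form; larger values mean a worse fit.
    """
    return sum(
        abs(wh * h + b - wt * t)
        for h, t, wh, wt, b in zip(head, tail, w_head, w_tail, bias)
    )


# Toy symmetric relation: with w_head = -w_tail, a pair that fits in one
# direction also fits with head and tail swapped.
w_head = [1.0, 1.0]
w_tail = [-1.0, -1.0]
bias = [0.5, -1.0]

a = [2.0, 3.0]
b = [-2.5, -2.0]  # chosen so that w_head * a + bias == w_tail * b

forward = linear_residual(a, b, w_head, w_tail, bias)
backward = linear_residual(b, a, w_head, w_tail, bias)

# With both weight vectors set to 1, the form reduces to a plain translation
# (tail = head + bias), so the same bias gives a poor fit for this pair.
translation_only = linear_residual(a, b, [1.0, 1.0], [1.0, 1.0], bias)
```

Because the residual uses only element-wise real arithmetic, its cost grows linearly with the embedding dimension, which is consistent with the scalability claim in the abstract.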
DensE: An Enhanced Non-Abelian Group Representation for Knowledge Graph Embedding
Capturing the composition patterns of relations is a vital task in knowledge graph completion. It also serves as a fundamental step towards multi-hop reasoning over learned knowledge. Previously, rotation-based translational methods, e.g., RotatE, have been developed to model composite relations using the product of a series of complex-valued diagonal matrices. However, RotatE makes several oversimplified assumptions about composition patterns, forcing the relations to be commutative, independent of entities, and fixed in scale. To tackle this problem, we have developed a novel knowledge graph embedding method, named DensE, to provide sufficient modeling capacity for complex composition patterns. In particular, our method decomposes each relation into an SO(3) group-based rotation operator and a scaling operator in three-dimensional (3-D) Euclidean space. The advantages of our method are twofold: (1) For composite relations, the corresponding diagonal relation matrices can be non-commutative and related to entity embeddings; (2) It extends the concept of RotatE to a more expressive setting with lower model complexity and preserves the direct geometrical interpretations, which reveal how relations with distinct patterns (i.e., symmetry/anti-symmetry, inversion and composition) are modeled. Experimental results on multiple benchmark knowledge graphs show that DensE outperforms the current state-of-the-art models for missing link prediction, especially on composite relations.
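The non-commutativity claim can be illustrated with plain 3-D rotations. The sketch below is not the DensE parameterization (the abstract does not give it, and the dependence on entity embeddings is not modeled); it only assumes a relation acts as "rotate about an axis, then scale", using Rodrigues' rotation formula. The function and variable names are hypothetical.

```python
import math


def rotate3d(v, axis, angle):
    """Rotate 3-D vector v about `axis` by `angle` radians (Rodrigues' formula)."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    dot = sum(ki * vi for ki, vi in zip(k, v))
    cross = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    return [vi * cos_a + ci * sin_a + ki * dot * (1 - cos_a)
            for vi, ci, ki in zip(v, cross, k)]


def apply_relation(v, axis, angle, scale):
    """Assumed relation operator: an SO(3) rotation followed by uniform scaling."""
    return [scale * x for x in rotate3d(v, axis, angle)]


entity = [1.0, 0.0, 0.0]
rel_a = dict(axis=[1.0, 0.0, 0.0], angle=math.pi / 2, scale=2.0)
rel_b = dict(axis=[0.0, 0.0, 1.0], angle=math.pi / 2, scale=0.5)

# Apply the two relations in both orders.
a_then_b = apply_relation(apply_relation(entity, **rel_a), **rel_b)
b_then_a = apply_relation(apply_relation(entity, **rel_b), **rel_a)

# The gap between the two results is nonzero, so the order of composition matters.
order_gap = sum(abs(x - y) for x, y in zip(a_then_b, b_then_a))
```

Here `a_then_b` ends up along the y-axis and `b_then_a` along the z-axis, whereas element-wise complex rotations of the RotatE kind give the same result in either order.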
Cognitive Amplifier for Internet of Things
Huang, Bing, Bouguettaya, Athman, Neiat, Azadeh Ghari
There is rising interest in applying Internet of Things (IoT) technology in smart homes to make occupants' lives more convenient. This convenience is underpinned by the principle of least effort, i.e., the premise that humans usually want to achieve goals with the least cognitive and physical effort [2]. IoT refers to the networked interconnection of everyday things, which are augmented with capabilities such as sensing, actuating, and communication [21]. The availability of IoT devices, including switch sensors, infrared motion sensors, pressure sensors, wearable sensors, accelerometers, and temperature, humidity, and light sensors, has the potential to realize this convenience. One challenge is that IoT devices are highly diverse in their supporting infrastructure, such as programming languages and communication protocols [5].
RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space
Sun, Zhiqing, Deng, Zhi-Hong, Nie, Jian-Yun, Tang, Jian
We study the problem of learning representations of entities and relations in knowledge graphs for predicting missing links. The success of such a task heavily relies on the ability of modeling and inferring the patterns of (or between) the relations. In this paper, we present a new approach for knowledge graph embedding called RotatE, which is able to model and infer various relation patterns including: symmetry/antisymmetry, inversion, and composition. Specifically, the RotatE model defines each relation as a rotation from the source entity to the target entity in the complex vector space. In addition, we propose a novel self-adversarial negative sampling technique for efficiently and effectively training the RotatE model. Experimental results on multiple benchmark knowledge graphs show that the proposed RotatE model is not only scalable, but also able to infer and model various relation patterns and significantly outperform existing state-of-the-art models for link prediction. Knowledge graphs are collections of factual triplets, where each triplet (h, r, t) represents a relation r between a head entity h and a tail entity t. Examples of real-world knowledge graphs include Freebase (Bollacker et al., 2008), Yago (Suchanek et al., 2007), and WordNet (Miller, 1995). Knowledge graphs are potentially useful to a variety of applications such as question-answering (Hao et al., 2017), information retrieval (Xiong et al., 2017), recommender systems (Zhang et al., 2016), and natural language processing (Yang & Mitchell, 2017).
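The rotation idea above can be sketched with Python's built-in complex numbers: each relation is a vector of phase angles, and applying it multiplies every coordinate of the head embedding by a unit-modulus complex number. The distance function below (a sum of moduli) and the toy values are assumptions for illustration, not the authors' implementation.

```python
import cmath
import math


def rotate(entity, phases):
    """Rotate each complex coordinate of `entity` by the matching angle (radians)."""
    return [x * cmath.exp(1j * p) for x, p in zip(entity, phases)]


def distance(head, phases, tail):
    """Sum of moduli of (rotated head - tail); 0 means the triple fits exactly."""
    return sum(abs(x - t) for x, t in zip(rotate(head, phases), tail))


head = [1 + 0j, 0 + 2j]
r1 = [math.pi / 2, math.pi / 4]
r2 = [math.pi / 3, -math.pi / 4]

# Follow r1 and then r2 to reach the tail entity.
middle = rotate(head, r1)
tail = rotate(middle, r2)

# Composition: a single relation whose phases are the sums maps head to tail.
r3 = [a + b for a, b in zip(r1, r2)]
composition_error = distance(head, r3, tail)

# Inversion: negating the phases maps tail back to head.
r3_inverse = [-p for p in r3]
inversion_error = distance(tail, r3_inverse, head)

# Symmetry: phases of 0 or pi undo themselves when applied twice.
symmetric = [math.pi, 0.0]
symmetry_error = distance(rotate(head, symmetric), symmetric, head)
```

All three errors are zero up to floating-point rounding, which is the sense in which a rotation-based model can represent composition, inversion, and symmetry.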
Towards Cognitive Automation of Data Science
Biem, Alain (IBM Research) | Butrico, Maria (IBM Research) | Feblowitz, Mark (IBM Research) | Klinger, Tim (IBM Research) | Malitsky, Yuri (IBM Research) | Ng, Kenney (IBM Research) | Perer, Adam (IBM Research) | Reddy, Chandra (IBM Research) | Riabov, Anton (IBM Research) | Samulowitz, Horst (IBM Research) | Sow, Daby (IBM Research) | Tesauro, Gerald (IBM Research) | Turaga, Deepak (IBM Research)
A data scientist typically performs a number of tedious and time-consuming steps to derive insight from a raw data set. The process usually starts with data ingestion, cleaning, and transformation (e.g., outlier removal, missing value imputation), then proceeds to model building, and finally to a presentation of predictions that align with the end user's objectives and preferences. It is a long, complex, and sometimes artful process requiring substantial time and effort, especially because of the combinatorial explosion in choices of algorithms (and platforms), their parameters, and their compositions. Tools that can help automate steps in this process have the potential to accelerate the time-to-delivery of useful results, expand the reach of data science to non-experts, and offer a more systematic exploration of the available options. This work presents a step towards this goal.