University of Mannheim
On Multi-Relational Link Prediction With Bilinear Models
Wang, Yanjie (University of Mannheim) | Gemulla, Rainer (University of Mannheim) | Li, Hui (The University of Hong Kong)
We study bilinear embedding models for the task of multi-relational link prediction and knowledge graph completion. Bilinear models are among the most basic models for this task; they are comparatively efficient to train and use, and they can provide good prediction performance. The main goal of this paper is to explore the expressiveness of and the connections between various bilinear models proposed in the literature. In particular, a substantial number of models can be represented as bilinear models with certain additional constraints enforced on the embeddings. We explore whether or not these constraints lead to universal models, which can in principle represent every set of relations, and whether or not there are subsumption relationships between various models. We report results of an independent experimental study that evaluates recent bilinear models in a common experimental setup. Finally, we provide evidence that relation-level ensembles of multiple bilinear models can achieve state-of-the-art prediction performance.
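As a rough illustration of what "a bilinear model with additional constraints" means, the sketch below scores a (subject, relation, object) triple as e_s^T R e_o, once with an unconstrained relation matrix and once with a diagonal one (the kind of constraint used by DistMult). The dimension, the random embeddings, and all function and variable names are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def bilinear_score(e_subj, R, e_obj):
    """Bilinear score of a triple: e_subj^T R e_obj."""
    return float(e_subj @ R @ e_obj)

rng = np.random.default_rng(0)
dim = 4  # embedding size, chosen arbitrarily for the example

# Hypothetical entity embeddings for a subject and an object.
e_s = rng.normal(size=dim)
e_o = rng.normal(size=dim)

# Unconstrained bilinear model: one full dim x dim matrix per relation.
R_full = rng.normal(size=(dim, dim))
score_full = bilinear_score(e_s, R_full, e_o)

# Constrained bilinear model: the relation matrix is forced to be diagonal.
r_diag = rng.normal(size=dim)
R_diag = np.diag(r_diag)
score_diag = bilinear_score(e_s, R_diag, e_o)

# With a diagonal matrix, swapping subject and object leaves the score
# unchanged, so the constrained model can only express symmetric relations.
score_diag_swapped = bilinear_score(e_o, R_diag, e_s)

print(score_full, score_diag, score_diag_swapped)
```

The last comparison hints at why expressiveness matters: a constraint that saves parameters may also rule out whole classes of relations, here the asymmetric ones.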
Marrying Uncertainty and Time in Knowledge Graphs
Chekol, Melisachew Wudage (University of Mannheim) | Pirrò, Giuseppe (ICAR-CNR) | Schoenfisch, Joerg (University of Mannheim) | Stuckenschmidt, Heiner (University of Mannheim)
The management of uncertainty is crucial when harvesting structured content from unstructured and noisy sources. Knowledge Graphs (KGs) are a prominent example. KGs maintain both numerical and non-numerical facts, with the support of an underlying schema. These facts are usually accompanied by a confidence score that indicates how likely they are to hold. Despite their popularity, most existing KGs focus on static data, thus impeding the availability of timewise knowledge. What is missing is a comprehensive solution for the management of uncertain and temporal data in KGs. The goal of this paper is to fill this gap. We rely on two main ingredients. The first is a numerical extension of Markov Logic Networks (MLNs) that provides the necessary underpinning to formalize the syntax and semantics of uncertain temporal KGs. The second is a set of Datalog constraints with inequalities that extend the underlying schema of the KGs and help to detect inconsistencies. From a theoretical point of view, we discuss the complexity of two important classes of queries for uncertain temporal KGs: maximum a-posteriori and conditional probability inference. Due to the hardness of these problems and the fact that MLN solvers do not scale well, we also explore the usage of Probabilistic Soft Logics (PSL) as a practical tool to support our reasoning tasks. We report on an experimental evaluation comparing the MLN and PSL approaches.
On the Containment of SPARQL Queries under Entailment Regimes
Chekol, Melisachew Wudage (University of Mannheim)
Most description logic (DL) query languages allow instance retrieval from an ABox. However, SPARQL is a schema query language allowing access to the TBox (in addition to the ABox). Moreover, its entailment regimes make it possible to take into account knowledge inferred from knowledge bases in the query answering process. This provides a new perspective for the containment problem. In this paper, we study the containment of SPARQL queries over OWL EL axioms under entailment. OWL EL is the language used by many large-scale ontologies and is based on EL++. The main contribution is a novel approach to rewriting queries using SPARQL property paths and the μ-calculus in order to reduce the containment test under entailment to a validity check in the μ-calculus.
Event-Based Clustering for Reducing Labeling Costs of Event-related Microposts
Schulz, Axel (DB Mobility Logistics AG and Technische Universität Darmstadt) | Janssen, Frederik (Technische Universität Darmstadt) | Ristoski, Petar (University of Mannheim) | Fürnkranz, Johannes (Technische Universität Darmstadt)
Automatically identifying the event type of event-related information in the sheer amount of social media data makes machine learning inevitable. However, this is highly dependent on (1) the number of correctly labeled instances and (2) labeling costs. Active learning has been proposed to reduce the number of instances to label. Although the thematic dimension is already used, other metadata such as spatial and temporal information, which is helpful for achieving a more fine-grained clustering, is currently not taken into account. In this paper, we present a novel event-based clustering strategy that makes use of temporal, spatial, and thematic metadata to determine instances to label. An evaluation on incident-related tweets shows that our selection strategy for active learning outperforms current state-of-the-art approaches even with few labeled instances.
Correlation-Based Refinement of Rules with Numerical Attributes
Melo, Andre (University of Mannheim) | Theobald, Martin (University of Antwerp) | Völker, Johanna (University of Mannheim)
Learning rules is a common way of extracting useful information from knowledge or data bases. Many such data sets contain numerical attributes. However, approaches like ILP or association rule mining are optimized for data with categorical values, and considering numerical attributes is expensive. In this paper, we present an extension to the top-down ILP algorithm, which enables an efficient discovery of datalog rules from data with both numerical and categorical attributes. Our approach comprises a preprocessing phase for computing the correlations between numerical and categorical attributes, as well as an extension to the ILP refinement step, which enables us to detect interesting candidate rules and to suggest refinements with relevant attribute combinations. We report on experiments with U.S. Census data, Freebase and DBpedia, and show that our approach helps to efficiently discover rules with numerical intervals.
RockIt: Exploiting Parallelism and Symmetry for MAP Inference in Statistical Relational Models
Noessner, Jan (University of Mannheim) | Niepert, Mathias (University of Washington) | Stuckenschmidt, Heiner (University of Mannheim)
RockIt is a maximum a-posteriori (MAP) query engine for statistical relational models. MAP inference in graphical models is an optimization problem which can be compiled to integer linear programs (ILPs). We describe several advances in translating MAP queries to ILP instances and present the novel meta-algorithm cutting plane aggregation (CPA). CPA exploits local context-specific symmetries and bundles up sets of linear constraints. The resulting counting constraints lead to more compact ILPs and make the symmetry of the ground model more explicit to state-of-the-art ILP solvers. Moreover, RockIt parallelizes most parts of the MAP inference pipeline taking advantage of ubiquitous shared-memory multi-core architectures. We report on extensive experiments with Markov logic network (MLN) benchmarks showing that RockIt outperforms the state-of-the-art systems Alchemy, Markov TheBeast, and Tuffy both in terms of efficiency and quality of results.
A Probabilistic-Logical Framework for Ontology Matching
Niepert, Mathias (University of Mannheim) | Meilicke, Christian (University of Mannheim) | Stuckenschmidt, Heiner (University of Mannheim)
Ontology matching is the problem of determining correspondences between concepts, properties, and individuals of different heterogeneous ontologies. With this paper we present a novel probabilistic-logical framework for ontology matching based on Markov logic. We define the syntax and semantics and provide a formalization of the ontology matching problem within the framework. The approach has several advantages over existing methods such as ease of experimentation, incoherence mitigation during the alignment process, and the incorporation of a-priori confidence values. We show empirically that the approach is efficient and more accurate than existing matchers on an established ontology alignment benchmark dataset.