Ontologies
Query-Based Comparison of Mappings in Ontology-Based Data Access
Bienvenu, Meghyn (Centre National de la Recherche Scientifique (CNRS)) | Rosati, Riccardo (Sapienza Università di Roma)
An ontology-based data access (OBDA) system is composed of one or more data sources, an ontology that provides a conceptual view of the data, and declarative mappings that relate the data and ontology schemas. In order to debug and optimize such systems, it is important to be able to analyze and compare OBDA specifications. Recent work in this direction compared specifications using classical notions of equivalence and entailment, but an interesting alternative is to consider query-based notions, in which two specifications are deemed equivalent if they give the same answers to the considered query or class of queries for all possible data sources. In this paper, we define such query-based notions of entailment and equivalence of OBDA specifications and investigate the complexity of the resulting analysis tasks when the ontology is formulated in (fragments of) DL-Lite R .
Ontology Instance Linking: Towards Interlinked Knowledge Graphs
Heflin, Jeff (Lehigh University) | Song, Dezhao (Thomson Reuters)
Due to the decentralized nature of the Semantic Web, the same real-world entity may be described in various data sources with different ontologies and assigned syntactically distinct identifiers. In order to facilitate data utilization and consumption in the Semantic Web, without compromising the freedom of people to publish their data, one critical problem is to appropriately interlink such heterogeneous data. This interlinking process is sometimes referred to as Entity Coreference, i.e., finding which identifiers refer to the same real-world entity. In this paper, we first summarize state-of-the-art algorithms in detecting such coreference relationships between ontology instances. We then discuss various techniques in scaling entity coreference to large-scale datasets. Finally, we present well-adopted evaluation datasets and metrics, and compare the performance of the state-of-the-art algorithms on such datasets.
Query Answering with Inconsistent Existential Rules under Stable Model Semantics
Wan, Hai (Sun Yat-sen University) | Zhang, Heng (Huazhong University of Science and Technology) | Xiao, Peng (Sun Yat-sen University) | Huang, Haoran (Fudan University ) | Zhang, Yan (Western Sydney University)
Classical inconsistency-tolerant query answering relies on selecting maximal components of an ABox/database which are consistent with the ontology. However, some rules in ontologies might be unreliable if they are extracted from ontology learning or written by unskillful knowledge engineers. In this paper we present a framework of handling inconsistent existential rules under stable model semantics, which is defined by a notion called rule repairs to select maximal components of the existential rules. Surprisingly, for R-acyclic existential rules with R-stratified or guarded existential rules with stratified negations, both the data complexity and combined complexity of query answering under the rule repair semantics remain the same as that under the conventional query answering semantics. This leads us to propose several approaches to handle the rule repair semantics by calling answer set programming solvers. An experimental evaluation shows that these approaches have good scalability of query answering under rule repairs on realistic cases.
Logical Foundations of Privacy-Preserving Publishing of Linked Data
Grau, Bernardo Cuenca (University of Oxford) | Kostylev, Egor V. (University of Oxford)
The widespread adoption of Linked Data has been driven by the increasing demand for information exchange between organisations, as well as by data publishing regulations in domains such as health care and governance. In this setting, sensitive information is at risk of disclosure since published data can be linked with arbitrary external data sources. In this paper we lay the foundations of privacy-preserving data publishing (PPDP) in the context of Linked Data. We consider anonymisations of RDF graphs (and, more generally, relational datasets with labelled nulls) and define notions of safe and optimal anonymisations. Safety ensures that the anonymised data can be published with provable protection guarantees against linking attacks, whereas optimality ensures that it preserves as much information from the original data as possible, while satisfying the safety requirement. We establish the complexity of the underpinning decision problems both under open-world semantics inherent to RDF and a closed-world semantics, where we assume that an attacker has complete knowledge over some part of the original data.
Column-Oriented Datalog Materialization for Large Knowledge Graphs
Urbani, Jacopo (Vrije Universiteit Amsterdam) | Jacobs, Ceriel (Vrije Universiteit Amsterdam) | Krötzsch, Markus (Technische Universität Dresden)
The evaluation of Datalog rules over large Knowledge Graphs (KGs) is essential for many applications. In this paper, we present a new method of materializing Datalog inferences, which combines a column-based memory layout with novel optimization methods that avoid redundant inferences at runtime. The pro-active caching of certain subqueries further increases efficiency. Our empirical evaluation shows that this approach can often match or even surpass the performance of state-of-the-art systems, especially under restricted resources.
Basic Probabilistic Ontological Data Exchange with Existential Rules
Lukasiewicz, Thomas (University of Oxford) | Martinez, Maria Vanina (Universidad Nacional del Sur-CONICET) | Predoiu, Livia (University of Oxford) | Simari, Gerardo I. (Universidad Nacional del Sur-CONICET)
We study the complexity of exchanging probabilistic data between ontology-based probabilistic databases. We consider the Datalog+/- family of languages as ontology and ontology mapping languages, and we assume different compact encodings of the probabilities of the probabilistic source databases via Boolean events. We provide an extensive complexity analysis of the problem of deciding the existence of a probabilistic (universal) solution for a given probabilistic source database relative to a (probabilistic) data exchange problem for the different languages considered.
A Joint Model for Question Answering over Multiple Knowledge Bases
Zhang, Yuanzhe (Institute of Automation, Chinese Academy of Sciences) | He, Shizhu (Institute of Automation, Chinese Academy of Sciences) | Liu, Kang (Institute of Automation, Chinese Academy of Sciences) | Zhao, Jun (Institute of Automation, Chinese Academy of Sciences)
As the amount of knowledge bases (KBs) grows rapidly, the problem of question answering (QA) over multiple KBs has drawn more attention. The most significant distinction between multiple KB-QA and single KB-QA is that the former must consider the alignments between KBs. The pipeline strategy first constructs the alignments independently, and then uses the obtained alignments to construct queries. However, alignment construction is not a trivial task, and the introduced noises would be passed on to query construction. By contrast, we notice that alignment construction and query construction are interactive steps, and jointly considering them would be beneficial. To this end, we present a novel joint model based on integer linear programming (ILP), uniting these two procedures into a uniform framework. The experimental results demonstrate that the proposed approach outperforms state-of-the-art systems, and is able to improve the performance of both alignment construction and query construction.
Ontology-Mediated Queries for NOSQL Databases
Mugnier, Marie-Laure (Université de Montpellier) | Rousset, Marie-Christine (Université ́Grenoble University) | Ulliana, Federico (Universite ́ de Montpellier)
Today, the main applications of OBDA SQL) defines a broad collection of languages. Keyvalue can be found in data integration as well as in querying the stores are NOSQL systems adopting the data model of Semantic Web. The interest of OBDA is to allow the users to key-value records (also called JSON records). These records ask queries on high-level ontology vocabularies and to delegate are processed on distributed systems, but also increasingly to algorithms (1) the reformulation of these high-level exchanged on the Web thereby replacing semistructured queries into a set of low-level databases queries, (2) the efficient XML data and many RDF formats (see JSON-LD (Sporny computation of their answers by native data management et al. 2004)). Key-value records are non-first normal forms systems in which data is stored and indexed, and (3) where values are not only atomic (in contrast with relational the combination of these answers in order to obtain the final databases) and nesting is possible (Abiteboul, Hull, answers to the users' query. The advantage of OBDA is and Vianu 1995).
A Model for Learning Description Logic Ontologies Based on Exact Learning
Konev, Boris (University of Liverpool) | Ozaki, Ana (University of Liverpool) | Wolter, Frank (University of Liverpool)
We investigate the problem of learning description logic (DL) ontologies in Angluin et al.’s framework of exact learning via queries posed to an oracle. We consider membership queries of the form “is a tuple a of individuals a certain answer to a data retrieval query q in a given ABox and the unknown target ontology?” and completeness queries of the form “does a hypothesis ontology entail the unknown target ontology?” Given a DL L and a data retrieval query language Q, we study polynomial learnability of ontologies in L using data retrieval queries in Q and provide an almost complete classification for DLs that are fragments of EL with role inclusions and of DL-Lite and for data retrieval queries that range from atomic queries and EL/ELI-instance queries to conjunctive queries. Some results are proved by non-trivial reductions to learning from subsumption examples.
On the Containment of SPARQL Queries under Entailment Regimes
Chekol, Melisachew Wudage (University of Mannheim)
Most description logics (DL) query languages allow instance retrieval from an ABox. However, SPARQL is a schema query language allowing access to the TBox (in addition to the ABox). Moreover, its entailment regimes enable to take into account knowledge inferred from knowledge bases in the query answering process. This provides a new perspective for the containment problem. In this paper, we study the containment of SPARQL queries over OWL EL axioms under entailment. OWL EL is the language used by many large scale ontologies and is based on EL ++ . The main contribution is a novel approach to rewriting queries using SPARQL property paths and the μ-calculus in order to reduce containment test under entailment into validity check in the μ-calculus.