Ontologies
Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF Systems
Motik, Boris (Oxford University) | Nenov, Yavor (Oxford University) | Piro, Robert (Oxford University) | Horrocks, Ian (Oxford University) | Olteanu, Dan (Oxford University)
We present a novel approach to parallel materialisation (i.e., fixpoint computation) of datalog programs in centralised, main-memory, multi-core RDF systems. Our approach comprises an algorithm that evenly distributes the workload to cores, and an RDF indexing data structure that supports efficient, 'mostly' lock-free parallel updates. Our empirical evaluation shows that our approach parallelises computation very well: with 16 physical cores, materialisation can be up to 13.9 times faster than with just one core.
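To make the fixpoint-parallelisation idea concrete, here is a minimal Python sketch (our illustration, not the authors' algorithm or indexing data structure): a delta-driven fixpoint for a single transitivity rule over RDF-style triples, where each iteration's rule applications are split across workers. The property name and the thread-based partitioning are assumptions for the example.

```python
# Illustrative sketch only: a delta-driven fixpoint for one transitivity
# rule over RDF-style triples, with each iteration's work split across
# workers. The paper's algorithm and its 'mostly' lock-free RDF index are
# far more sophisticated; the property name here is made up, and Python
# threads only illustrate the partitioning pattern (the GIL prevents true
# CPU parallelism).
from concurrent.futures import ThreadPoolExecutor

P = "connectedTo"  # hypothetical transitive property

def apply_rule(delta_chunk, by_subject):
    """Derive (s, P, o2) from a new fact (s, P, o) and an existing (o, P, o2)."""
    derived = set()
    for (s, _, o) in delta_chunk:
        for (_, _, o2) in by_subject.get(o, ()):
            derived.add((s, P, o2))
    return derived

def materialise(triples, workers=4):
    facts = set(triples)
    delta = set(triples)
    while delta:
        # index all facts by subject for the join in apply_rule
        by_subject = {}
        for t in facts:
            by_subject.setdefault(t[0], []).append(t)
        # split the newly derived facts evenly across workers
        chunks = [list(delta)[i::workers] for i in range(workers)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = pool.map(apply_rule, chunks, [by_subject] * workers)
        new = set().union(*results) - facts
        facts |= new
        delta = new
    return facts

print(len(materialise({("a", P, "b"), ("b", P, "c"), ("c", P, "d")})))  # 6 facts
```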
Abduction Framework for Repairing Incomplete EL Ontologies: Complexity Results and Algorithms
Wei-Kleiner, Fang (Linköping University) | Dragisic, Zlatan (Linköping University) | Lambrix, Patrick (Linköping University)
In this paper we consider the problem of repairing missing is-a relations in ontologies. We formalize the problem as a generalized TBox abduction problem (GTAP). Based on this abduction framework, we present complexity results for the existence, relevance and necessity decision problems for GTAP, with and without specific preference relations, for ontologies that can be represented in a member of the EL family of description logics. Further, we present algorithms for finding solutions, as well as a system and experimental results.
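For orientation, a schematic and simplified formulation of a TBox abduction problem of this kind is sketched below; the paper's exact definition of GTAP, and in particular its preference relations, differs in detail.

```latex
% Schematic, simplified formulation (not the paper's exact definition):
% given a TBox T and a set M of missing is-a relations (atomic concept
% inclusions), find a solution S drawn from a set of allowed candidate
% axioms such that adding S to T entails every missing relation.
\[
  \text{Given } \mathcal{T} \text{ and } M = \{\, A_i \sqsubseteq B_i \,\},\quad
  \text{find } S \subseteq \mathit{Cand} \text{ such that }
  \mathcal{T} \cup S \models A_i \sqsubseteq B_i \text{ for all } i .
\]
```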
Datalog Rewritability of Disjunctive Datalog Programs and its Applications to Ontology Reasoning
Kaminski, Mark (University of Oxford) | Nenov, Yavor (University of Oxford) | Grau, Bernardo Cuenca (University of Oxford)
We study the problem of rewriting a disjunctive datalog program into plain datalog. We show that a disjunctive program is rewritable if and only if it is equivalent to a linear disjunctive program, thus providing a novel characterisation of datalog rewritability. Motivated by this result, we propose weakly linear disjunctive datalog -- a novel rule-based KR language that extends both datalog and linear disjunctive datalog and for which reasoning is tractable in data complexity. We then explore applications of weakly linear programs to ontology reasoning and propose a tractable extension of OWL 2 RL with disjunctive axioms. Our empirical results suggest that many non-Horn ontologies can be reduced to weakly linear programs and that query answering over such ontologies using a datalog engine is feasible in practice.
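As a hedged illustration of the linearity condition (our own example, not one from the paper): in a linear disjunctive program every rule body contains at most one IDB (derived) atom, whereas general disjunctive rules may join several.

```latex
% Illustrative rules only; predicate names are made up. Assume Male,
% Female, Parent and Happy are IDB predicates, Person and hasChild are EDB.
\[
\begin{aligned}
  &\mathit{Male}(x) \lor \mathit{Female}(x) \leftarrow \mathit{Person}(x)
     && \text{(disjunctive head)}\\
  &\mathit{Parent}(x) \leftarrow \mathit{Male}(x), \mathit{hasChild}(x,y)
     && \text{(one IDB body atom: linear)}\\
  &\mathit{Happy}(x) \leftarrow \mathit{Male}(x), \mathit{Parent}(x)
     && \text{(two IDB body atoms: not linear)}
\end{aligned}
\]
```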
XML Matchers: approaches and challenges
Agreste, Santa | De Meo, Pasquale | Ferrara, Emilio | Ursino, Domenico
Schema Matching, i.e., the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research for many years. In the past it was investigated mainly for classical database models (e.g., E/R schemas, relational databases). In recent years, however, the widespread adoption of XML in the most disparate application fields has pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aimed at finding semantic matches between concepts defined in DTDs and XSDs. XML Matchers do not simply take well-known techniques originally designed for other data models and apply them to DTDs/XSDs; they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs impact the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.
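For readers unfamiliar with the components such matchers combine, the following toy Python sketch (our own illustration, not any tool surveyed in the paper) scores correspondences between two XSD-like schemas by mixing a linguistic similarity on element names with a structural similarity on element paths.

```python
# Toy illustration of an XML Matcher's core idea (not any surveyed tool):
# combine a linguistic similarity on element names with a structural
# similarity on element paths, and keep the best-scoring correspondences.
from difflib import SequenceMatcher

def name_sim(a, b):
    """String similarity between two element names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def path_sim(p, q):
    """Jaccard overlap of the labels along two slash-separated paths."""
    s, t = set(p.split("/")), set(q.split("/"))
    return len(s & t) / len(s | t)

def score(p, q, w_name=0.7, w_path=0.3):
    return (w_name * name_sim(p.split("/")[-1], q.split("/")[-1])
            + w_path * path_sim(p, q))

def match(schema_a, schema_b, threshold=0.4):
    """schema_a / schema_b: lists of slash-separated element paths."""
    matches = []
    for p in schema_a:
        best = max(schema_b, key=lambda q: score(p, q))
        if score(p, best) >= threshold:
            matches.append((p, best, round(score(p, best), 2)))
    return matches

# 'name' pairs with 'fullName'; 'price'/'cost' falls below the threshold.
print(match(["order/customer/name", "order/item/price"],
            ["purchase/client/fullName", "purchase/item/cost"]))
```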
Ontology-Based Monitoring of Dynamic Systems
Our understanding of the notion "dynamic system" is a rather broad one: such a system has states, which can change over time. Ontologies are used to describe the states of the system, possibly in an incomplete way. Monitoring is then concerned with deciding whether some run of the system or all of its runs satisfy a certain property, which can be expressed by a formula of an appropriate temporal logic. We consider different instances of this broad framework, which can roughly be classified into two cases. In one instance, the system is assumed to be a black box, whose inner workings are not known, but whose states can be (partially) observed during a run of the system. In the second instance, one has (partial) knowledge about the inner workings of the system, which provides information on which runs of the system are possible. In this paper, we will review some of our recent work that can be seen as instances of this general framework of ontology-based monitoring of dynamic systems. We will also mention possible extensions towards probabilistic reasoning and the integration of mathematical modeling of dynamical systems.
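A minimal Python sketch of the black-box flavour of this setting (our illustration, not the paper's formalism): a run is a sequence of partially observed states, and a monitor checks a simple response-style temporal property over it.

```python
# Illustrative sketch only: monitoring a run of a black-box system whose
# states are observed partially (here, as sets of facts). The property
# checked is a simple "always (alarm implies eventually acknowledged)"
# pattern; real ontology-based monitoring uses DL ontologies and
# temporal-logic formulas over possibly incomplete state descriptions.

def run_satisfies(run, trigger="Alarm", response="Acknowledged"):
    """True if every state containing `trigger` is followed
    (at the same or a later state) by one containing `response`."""
    for i, state in enumerate(run):
        if trigger in state:
            if not any(response in later for later in run[i:]):
                return False
    return True

run = [{"Running"}, {"Running", "Alarm"}, {"Running"}, {"Acknowledged"}]
print(run_satisfies(run))       # True: the alarm is later acknowledged
print(run_satisfies(run[:3]))   # False: no acknowledgement after the alarm
```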
Extract ABox Modules for Efficient Ontology Querying
Xu, Jia | Shironoshita, Patrick | Visser, Ubbo | John, Nigel | Kabuka, Mansur
The extraction of logically independent fragments from an ontology ABox can be useful for addressing the tractability problem of querying ontologies with large ABoxes. In this paper, we propose a formal definition of an ABox module that guarantees complete preservation of facts about a given set of individuals and can therefore be reasoned over independently w.r.t. the ontology TBox. With ABox modules of this type, isolated or distributed (parallel) ABox reasoning becomes feasible, and more efficient data retrieval from ontology ABoxes can be attained. To compute such an ABox module, we present a theoretical approach and also an approximation for $\mathcal{SHIQ}$ ontologies. Evaluation of the module approximation on different types of ontologies shows that, on average, extracted ABox modules are significantly smaller than the entire ABox, and that the time for ontology reasoning based on ABox modules can be reduced significantly.
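As a rough intuition only (not the paper's formal module definition or its SHIQ-aware approximation), the Python sketch below collects the ABox assertions reachable from a set of seed individuals by following role assertions in either direction, yielding a fragment that can be handed to a reasoner on its own.

```python
# Rough intuition only (not the paper's formal ABox-module definition or
# its SHIQ approximation): extract the ABox fragment reachable from a set
# of seed individuals by following role assertions in either direction.

def extract_module(abox, seeds):
    """abox: concept assertions ('C', a) and role assertions ('r', a, b)."""
    individuals = set(seeds)
    module = set()
    changed = True
    while changed:
        changed = False
        for assertion in abox:
            if assertion in module:
                continue
            if len(assertion) == 2:           # concept assertion C(a)
                _, a = assertion
                if a in individuals:
                    module.add(assertion)
                    changed = True
            else:                             # role assertion r(a, b)
                _, a, b = assertion
                if a in individuals or b in individuals:
                    module.add(assertion)
                    individuals.update({a, b})
                    changed = True
    return module

abox = {("Student", "alice"), ("supervises", "bob", "alice"),
        ("Professor", "bob"), ("Student", "carol")}
# Keeps the facts about alice and her supervisor bob; carol's fact is excluded.
print(extract_module(abox, {"alice"}))
```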
Bridging the gap between Legal Practitioners and Knowledge Engineers using semi-formal KR
Ramakrishna, Shashishekar | Paschke, Adrian
The use of Structured English as a computation independent knowledge representation format for non-technical users in business rules representation has been proposed in the OMG's Semantics of Business Vocabulary and Business Rules (SBVR). In the legal domain we face a similar problem. Formal representation languages, such as OASIS LegalRuleML and legal ontologies (LKIF, legal OWL 2 ontologies, etc.), support the technical knowledge engineer and automated reasoning. However, they can hardly be used directly by legal domain experts who do not have a computer science background. In this paper we adapt the SBVR Structured English approach for the legal domain and implement a proof-of-concept, called KR4IPLaw, which enables legal domain experts to represent their knowledge in Structured English in a computation independent and hence, for them, more usable way. The benefit of this approach is that the pre-defined semantics underlying Structured English makes transformations into formal languages such as OASIS LegalRuleML and OWL 2 ontologies possible. We exemplify our approach in the domain of patent law.
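As a hypothetical illustration of the kind of transformation meant here (our example, not one from the paper or the KR4IPLaw tool), a Structured English rule in the patent-law domain could be rendered as a description-logic axiom:

```latex
% Hypothetical example (ours, not from the paper): the Structured English rule
%   "It is obligatory that each patent application specifies at least one claim."
% could be transformed into an OWL 2 / description-logic axiom such as
\[
  \mathit{PatentApplication} \;\sqsubseteq\; \geq 1\, \mathit{specifies}.\mathit{Claim}
\]
```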
Efficient Computation of the Well-Founded Semantics over Big Data
Tachmazidis, Ilias | Antoniou, Grigoris | Faber, Wolfgang
Data originating from the Web, sensor readings and social media result in increasingly huge datasets. So-called Big Data comes with new scientific and technological challenges while creating new opportunities, hence the increasing interest in academia and industry. Traditionally, logic programming has focused on complex knowledge structures/programs, so the question arises whether and how it can work in the face of Big Data. In this paper, we examine how the well-founded semantics can process huge amounts of data through mass parallelization. More specifically, we propose and evaluate a parallel approach using the MapReduce framework. Our experimental results indicate that our approach is scalable and that the well-founded semantics can be applied to billions of facts. To the best of our knowledge, this is the first work that addresses large-scale nonmonotonic reasoning without the restriction of stratification for predicates of arbitrary arity. (To appear in Theory and Practice of Logic Programming (TPLP).)
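To illustrate the mass-parallelisation pattern at a very small scale (our sketch, not the paper's implementation, which also handles negation under the well-founded semantics): a single positive rule application can be phrased as a MapReduce job, with the map phase keying facts by the join variable and the reduce phase emitting the joined conclusions.

```python
# Tiny in-memory imitation of one MapReduce-style rule application
# (our sketch, not the paper's implementation).
# Rule: ancestor(X, Z) <- parent(X, Y), ancestor(Y, Z)
# The map phase keys each fact by the join variable Y; reduce joins per key.
from collections import defaultdict

def map_phase(facts):
    for pred, a, b in facts:
        if pred == "parent":
            yield (b, ("parent", a))      # key by Y (object of parent)
        elif pred == "ancestor":
            yield (a, ("ancestor", b))    # key by Y (subject of ancestor)

def reduce_phase(values):
    parents = [a for tag, a in values if tag == "parent"]
    ancestors = [b for tag, b in values if tag == "ancestor"]
    return {("ancestor", a, b) for a in parents for b in ancestors}

def run_job(facts):
    groups = defaultdict(list)
    for key, value in map_phase(facts):
        groups[key].append(value)
    derived = set()
    for values in groups.values():
        derived |= reduce_phase(values)
    return derived

facts = {("parent", "ann", "bea"), ("ancestor", "bea", "cid")}
print(run_job(facts))   # {('ancestor', 'ann', 'cid')}
```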
Semantic Enrichments in Text Supervised Classification: Application to Medical Domain
Albitar, Shereen (Aix-Marseille Université, LSIS) | Espinasse, Bernard (Aix-Marseille Université, LSIS) | Fournier, Sébastien (Aix-Marseille Université, LSIS)
The use of semantics in supervised text classification can improve its effectiveness, especially in specific domains. Most state-of-the-art works use concepts as an alternative to words in order to transform the classical bag of words (BOW) into a bag of concepts (BOC). This transformation is performed by a conceptualization task. Furthermore, the resulting BOC can be enriched with other related concepts drawn from semantic resources, which may enhance classification effectiveness as well. This paper focuses on two strategies for semantic enrichment of conceptualized text representations: the first is based on the semantic kernel method, while the second is based on the enriching vectors method. These two semantic enrichment strategies are evaluated through experiments in the medical domain, using Rocchio as the supervised classification method, the UMLS ontology and the Ohsumed corpus.
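For concreteness, a minimal Python sketch of Rocchio (nearest-centroid) classification over enriched concept vectors is given below; it is our own illustration, and the concept identifiers, relatedness map and weights are made up rather than taken from UMLS or the paper's pipeline.

```python
# Minimal sketch: bag-of-concepts vectors, a toy enrichment step that adds
# related concepts with a reduced weight, and Rocchio (nearest-centroid)
# classification. Concept IDs, the relatedness map and all weights are
# illustrative, not from UMLS or the paper.
from collections import Counter
from math import sqrt

RELATED = {"heart_attack": ["myocardial_infarction"], "aspirin": ["nsaid"]}

def enrich(concepts, weight=0.5):
    vec = Counter(concepts)
    for c in concepts:
        for r in RELATED.get(c, []):
            vec[r] += weight          # related concepts get a smaller weight
    return vec

def cosine(u, v):
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def centroid(vectors):
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({k: x / len(vectors) for k, x in total.items()})

def rocchio_classify(doc, training):
    """training: {label: list of concept lists}; returns the closest centroid's label."""
    centroids = {label: centroid([enrich(d) for d in docs])
                 for label, docs in training.items()}
    query = enrich(doc)
    return max(centroids, key=lambda label: cosine(query, centroids[label]))

training = {"cardiology": [["heart_attack", "chest_pain"], ["myocardial_infarction"]],
            "pharmacology": [["aspirin", "dosage"], ["nsaid", "dosage"]]}
print(rocchio_classify(["chest_pain", "heart_attack"], training))  # cardiology
```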
A Cookbook for Temporal Conceptual Data Modelling with Description Logics
Artale, Alessandro | Kontchakov, Roman | Ryzhikov, Vladislav | Zakharyaschev, Michael
We design temporal description logics suitable for reasoning about temporal conceptual data models and investigate their computational complexity. Our formalisms are based on DL-Lite logics with three types of concept inclusions (ranging from atomic concept inclusions and disjointness to the full Booleans), as well as cardinality constraints and role inclusions. In the temporal dimension, they capture future and past temporal operators on concepts, flexible and rigid roles, the operators 'always' and 'some time' on roles, data assertions for particular moments of time and global concept inclusions. The logics are interpreted over the Cartesian products of object domains and the flow of time (Z,<), satisfying the constant domain assumption. We prove that the most expressive of our temporal description logics (which can capture lifespan cardinalities and either qualitative or quantitative evolution constraints) turn out to be undecidable. However, by omitting some of the temporal operators on concepts/roles or by restricting the form of concept inclusions we obtain logics whose complexity ranges between PSpace and NLogSpace. These positive results are obtained by reduction to various clausal fragments of propositional temporal logic, which opens the way to employing propositional or first-order temporal provers for reasoning about temporal data models.
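As a small hedged illustration of the kind of qualitative evolution constraints such temporalised concept inclusions can express (our own example, not one from the paper):

```latex
% Illustrative temporalised concept inclusions (our example, not from the paper):
% "every manager was an employee at some moment in the past" and
% "a completed project remains completed at all future moments".
\[
  \mathit{Manager} \sqsubseteq \Diamond_P\, \mathit{Employee}
  \qquad\qquad
  \mathit{Completed} \sqsubseteq \Box_F\, \mathit{Completed}
\]
```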