Ontologies
Exact Learning of Lightweight Description Logic Ontologies
Konev, Boris (University of Liverpool) | Lutz, Carsten (University of Bremen) | Ozaki, Ana (University of Liverpool) | Wolter, Frank (University of Liverpool)
We study learning of description logic TBoxes in Angluin et al.’s framework of exact learning via queries. We admit entailment queries (“is a given subsumption entailed by the target TBox?”) and equivalence queries (“is a given TBox equivalent to the target TBox?”), assuming that the signature and logic of the target TBox are known. We present three main results: (1) TBoxes formulated in DL-Lite with role inclusions and composite concepts on the right-hand side of concept inclusions can be learned in polynomial time; (2) EL TBoxes with only concept names on the right-hand side of concept inclusions can be learned in polynomial time; and (3) EL TBoxes cannot be learned in polynomial time. It follows that non-polynomial time learnability of EL TBoxes is caused by the interaction between existential restrictions on the right and left-hand sides of concept inclusions. We also show that neither entailment nor equivalence queries alone are sufficient in cases (1) and (2) above.
Polynomial Combined Rewritings for Existential Rules
Gottlob, Georg (University of Oxford) | Manna, Marco (University of Calabria) | Pieris, Andreas (University of Oxford)
We consider the scenario of ontology-based data access where a conjunctive query is evaluated against a database enriched with intensional knowledge via an ontology. It is generally accepted that true scalability of query answering in this setting can only be achieved by using standard relational database management systems (RDBMSs). An approach to query answering that enables the use of RDBMSs is the so-called polynomial combined approach. We investigate this approach for the main guarded- and sticky-based classes of existential rules, and we highlight the assumptions on the underlying schema which are sufficient for the polynomial combined first-order rewritability of those classes. To the best of our knowledge, this is the first work which explicitly studies the polynomial combined approach for existential rules.
Extract ABox Modules for Efficient Ontology Querying
Xu, Jia, Shironoshita, Patrick, Visser, Ubbo, John, Nigel, Kabuka, Mansur
The extraction of logically-independent fragments out of an ontology ABox can be useful for solving the tractability problem of querying ontologies with large ABoxes. In this paper, we propose a formal definition of an ABox module, such that it guarantees complete preservation of facts about a given set of individuals, and thus can be reasoned independently w.r.t. the ontology TBox. With ABox modules of this type, isolated or distributed (parallel) ABox reasoning becomes feasible, and more efficient data retrieval from ontology ABoxes can be attained. To compute such an ABox module, we present a theoretical approach and also an approximation for $\mathcal{SHIQ}$ ontologies. Evaluation of the module approximation on different types of ontologies shows that, on average, extracted ABox modules are significantly smaller than the entire ABox, and the time for ontology reasoning based on ABox modules can be improved significantly.
Bridging the gap between Legal Practitioners and Knowledge Engineers using semi-formal KR
Ramakrishna, Shashishekar, Paschke, Adrian
The use of Structured English as a computation independent knowledge representation format for non-technical users in business rules representation has been proposed in OMGs Semantics and Business Vocabulary Representation (SBVR). In the legal domain we face a similar problem. Formal representation languages, such as OASIS LegalRuleML and legal ontologies (LKIF, legal OWL2 ontologies etc.) support the technical knowledge engineer and the automated reasoning. But, they can be hardly used directly by the legal domain experts who do not have a computer science background. In this paper we adapt the SBVR Structured English approach for the legal domain and implement a proof-of-concept, called KR4IPLaw, which enables legal domain experts to represent their knowledge in Structured English in a computational independent and hence, for them, more usable way. The benefit of this approach is that the underlying pre-defined semantics of the Structured English approach makes transformations into formal languages such as OASIS LegalRuleML and OWL2 ontologies possible. We exemplify our approach in the domain of patent law.
Efficient Computation of the Well-Founded Semantics over Big Data
Tachmazidis, Ilias, Antoniou, Grigoris, Faber, Wolfgang
Data originating from the Web, sensor readings and social media result in increasingly huge datasets. The so called Big Data comes with new scientific and technological challenges while creating new opportunities, hence the increasing interest in academia and industry. Traditionally, logic programming has focused on complex knowledge structures/programs, so the question arises whether and how it can work in the face of Big Data. In this paper, we examine how the well-founded semantics can process huge amounts of data through mass parallelization. More specifically, we propose and evaluate a parallel approach using the MapReduce framework. Our experimental results indicate that our approach is scalable and that well-founded semantics can be applied to billions of facts. To the best of our knowledge, this is the first work that addresses large scale nonmonotonic reasoning without the restriction of stratification for predicates of arbitrary arity. To appear in Theory and Practice of Logic Programming (TPLP).
Semantic Enrichments in Text Supervised Classification: Application to Medical Domain
Albitar, Shereen (Aix-Marseille Université, LSIS) | Espinasse, Bernard (Aix-Marseille Université, LSIS) | Fournier, Sébastien (Aix-Marseille Université, LSIS)
The use of semantics in supervised text classification can improve its effectiveness especially in specific domains. Most state of the art works use concepts as an alternative to words in order to transform the classical bag of words (BOW) into a Bag of concepts (BOC). This transformation is done through conceptualization task. Furthermore, the resulting BOC can be enriched using other related concepts from semantic resources. This enrichment may enhance classification effectiveness as well. This paper focuses on two strategies for semantic enrichment of conceptualized text representation. The first one is based on semantic kernel method while the second one is based on enriching vectors method. These two semantic enrichment strategies are evaluated through experiments using Rocchio as the supervised classification method in the medical domain, using UMLS ontology and Ohsumed corpus.
A Cookbook for Temporal Conceptual Data Modelling with Description Logics
Artale, Alessandro, Kontchakov, Roman, Ryzhikov, Vladislav, Zakharyaschev, Michael
We design temporal description logics suitable for reasoning about temporal conceptual data models and investigate their computational complexity. Our formalisms are based on DL-Lite logics with three types of concept inclusions (ranging from atomic concept inclusions and disjointness to the full Booleans), as well as cardinality constraints and role inclusions. In the temporal dimension, they capture future and past temporal operators on concepts, flexible and rigid roles, the operators `always' and `some time' on roles, data assertions for particular moments of time and global concept inclusions. The logics are interpreted over the Cartesian products of object domains and the flow of time (Z,<), satisfying the constant domain assumption. We prove that the most expressive of our temporal description logics (which can capture lifespan cardinalities and either qualitative or quantitative evolution constraints) turn out to be undecidable. However, by omitting some of the temporal operators on concepts/roles or by restricting the form of concept inclusions we obtain logics whose complexity ranges between PSpace and NLogSpace. These positive results were obtained by reduction to various clausal fragments of propositional temporal logic, which opens a way to employ propositional or first-order temporal provers for reasoning about temporal data models.
Interactive ontology debugging: two query strategies for efficient fault localization
Shchekotykhin, Kostyantyn, Friedrich, Gerhard, Fleiss, Philipp, Rodler, Patrick
Effective debugging of ontologies is an important prerequisite for their broad application, especially in areas that rely on everyday users to create and maintain knowledge bases, such as the Semantic Web. In such systems ontologies capture formalized vocabularies of terms shared by its users. However in many cases users have different local views of the domain, i.e. of the context in which a given term is used. Inappropriate usage of terms together with natural complications when formulating and understanding logical descriptions may result in faulty ontologies. Recent ontology debugging approaches use diagnosis methods to identify causes of the faults. In most debugging scenarios these methods return many alternative diagnoses, thus placing the burden of fault localization on the user. This paper demonstrates how the target diagnosis can be identified by performing a sequence of observations, that is, by querying an oracle about entailments of the target ontology. To identify the best query we propose two query selection strategies: a simple "split-in-half" strategy and an entropy-based strategy. The latter allows knowledge about typical user errors to be exploited to minimize the number of queries. Our evaluation showed that the entropy-based method significantly reduces the number of required queries compared to the "split-in-half" approach. We experimented with different probability distributions of user errors and different qualities of the a-priori probabilities. Our measurements demonstrated the superiority of entropy-based query selection even in cases where all fault probabilities are equal, i.e. where no information about typical user errors is available.
Natural Language Access to Enterprise Data
Waltinger, Ulli (Siemens AG) | Tecuci, Dan (Siemens Corporation) | Olteanu, Mihaela (Siemens AG) | Mocanu, Vlad (Siemens AG) | Sullivan, Sean (Siemens Energy Inc.)
This paper describes USI Answers — a natural language question answering system for enterprise data. We report on the progress towards the goal of offering easy access to enterprise data to a large number of business users, most of whom are not familiar with the specific syntax or semantics of the underlying data sources. Additional complications come from the nature of the data, which comes both as structured and unstructured. The proposed solution allows users to express questions in natural language, makes apparent the system's interpretation of the query, and allows easy query adjustment and reformulation. The application is in use by more than 1500 users from Siemens Energy. We evaluate our approach on a data set consisting of fleet data.
Shiva: A Framework for Graph Based Ontology Matching
Mathur, Iti, Joshi, Nisheeth, Darbari, Hemant, Kumar, Ajai
Since long, corporations are looking for knowledge sources which can provide structured description of data and can focus on meaning and shared understanding. Structures which can facilitate open world assumptions and can be flexible enough to incorporate and recognize more than one name for an entity. A source whose major purpose is to facilitate human communication and interoperability. Clearly, databases fail to provide these features and ontologies have emerged as an alternative choice, but corporations working on same domain tend to make different ontologies. The problem occurs when they want to share their data/knowledge. Thus we need tools to merge ontologies into one. This task is termed as ontology matching. This is an emerging area and still we have to go a long way in having an ideal matcher which can produce good results. In this paper we have shown a framework to matching ontologies using graphs.