We study the problem of rewriting a disjunctive datalog program into plain datalog. We show that a disjunctive program is rewritable if and only if it is equivalent to a linear disjunctive program, thus providing a novel characterisation of datalog rewritability. Motivated by this result, we propose weakly linear disjunctive datalog -- a novel rule-based KR language that extends both datalog and linear disjunctive datalog and for which reasoning is tractable in data complexity. We then explore applications of weakly linear programs to ontology reasoning and propose a tractable extension of OWL 2 RL with disjunctive axioms. Our empirical results suggest that many non-Horn ontologies can be reduced to weakly linear programs and that query answering over such ontologies using a datalog engine is feasible in practice.
Datalog is one of the best-known rule-based languages, and extensions of it are used in a wide context of applications. An important Datalog extension is Disjunctive Datalog, which significantly increases the expressivity of the basic language. Disjunctive Datalog is useful in a wide range of applications, ranging from Databases (e.g., Data Integration) to Artificial Intelligence (e.g., diagnosis and planning under incomplete knowledge). However, in recent years an important shortcoming of Datalog-based languages became evident, e.g. in the context of data-integration (consistent query-answering, ontology-based data access) and Semantic Web applications: The language does not permit any generation of and reasoning with unnamed individuals in an obvious way. In general, it is weak in supporting many cases of existential quantification. To overcome this problem, Datalogex has recently been proposed, which extends traditional Datalog by existential quantification in rule heads. In this work, we propose a natural extension of Disjunctive Datalog and Datalogex, called Datalogexor, which allows both disjunctions and existential quantification in rule heads and is therefore an attractive language for knowledge representation and reasoning, especially in domains where ontology-based reasoning is needed. We formally define syntax and semantics of the language Datalogexor, and provide a notion of instantiation, which we prove to be adequate for Datalogexor. A main issue of Datalogex and hence also of Datalogexor is that decidability is no longer guaranteed for typical reasoning tasks. In order to address this issue, we identify many decidable fragments of the language, which extend, in a natural way, analog classes defined in the non-disjunctive case. Moreover, we carry out an in-depth complexity analysis, deriving interesting results which range from Logarithmic Space to Exponential Time.
Datalog+/- is a family of ontology languages that combine good computational properties with high expressive power. Datalog+/- languages are provably able to capture the most relevant Semantic Web languages. In this paper we consider the class of weakly-sticky (WS) Datalog+/- programs, which allow for certain useful forms of joins in rule bodies as well as extending the well-known class of weakly-acyclic TGDs. So far, only non-deterministic algorithms were known for answering queries on WS Datalog+/- programs. We present novel deterministic query answering algorithms under WS Datalog+/-. In particular, we propose: (1) a bottom-up grounding algorithm based on a query-driven chase, and (2) a hybrid approach based on transforming a WS program into a so-called sticky one, for which query rewriting techniques are known. We discuss how our algorithms can be optimized and effectively applied for query answering in real-world scenarios.
This paper proposes the use of Constraint Logic Programming (CLP) to model SQL queries in a data-independent abstract layer by focusing on some semantic properties for signalling possible errors in such queries. First, we define a translation from SQL to Datalog, and from Datalog to CLP, so that solving this CLP program will give information about inconsistency, tautology, and possible simplifications. We use different constraint domains which are mapped to SQL types, and propose them to cooperate for improving accuracy. Our approach leverages a deductive system that includes SQL and Datalog, and we present an implementation in this system which is currently being tested in classroom, showing its advantages and differences with respect to other approaches, as well as some performance data. This paper is under consideration for acceptance in TPLP .
We study the problem of multiset prediction. The goal of multiset prediction is to train a predictor that maps an input to a multiset consisting of multiple items. Unlike existing problems in supervised learning, such as classification, ranking and sequence generation, there is no known order among items in a target multiset, and each item in the multiset may appear more than once, making this problem extremely challenging. In this paper, we propose a novel multiset loss function by viewing this problem from the perspective of sequential decision making. The proposed multiset loss function is empirically evaluated on two families of datasets, one synthetic and the other real, with varying levels of difficulty, against various baseline loss functions including reinforcement learning, sequence, and aggregated distribution matching loss functions.