Collaborating Authors

 Lenzerini, Maurizio


QDEF and Its Approximations in OBDM

arXiv.org Artificial Intelligence

Given an input dataset (i.e., a set of tuples), query definability in Ontology-based Data Management (OBDM) amounts to finding a query over the ontology whose certain answers coincide with the tuples in the given dataset. We refer to such a query as a characterization of the dataset with respect to the OBDM system. Our first contribution is to propose approximations of perfect characterizations in terms of recall (complete characterizations) and precision (sound characterizations). A second contribution is to present a thorough complexity analysis of three computational problems, namely verification (check whether a given query is a perfect or an approximated characterization of a given dataset), existence (check whether a perfect or a best approximated characterization of a given dataset exists), and computation (compute a perfect or best approximated characterization of a given dataset).
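The distinction between perfect, sound, and complete characterizations can be illustrated with a minimal sketch. It assumes the certain answers of a candidate query have already been computed by some OBDM system and are handed over as a plain set of tuples. The function name `classify_characterization` and the reading of "sound" as "every certain answer is in the dataset" and "complete" as "every dataset tuple is a certain answer" are assumptions drawn from the precision/recall wording of the abstract, not definitions quoted from the paper.

```python
from typing import Set, Tuple

Row = Tuple[str, ...]


def classify_characterization(certain_answers: Set[Row], dataset: Set[Row]) -> str:
    """Compare a query's certain answers with the target dataset.

    Hypothetical helper: the certain answers are assumed to be precomputed.
    """
    # Sound (precision = 1): the query returns nothing outside the dataset.
    sound = certain_answers <= dataset
    # Complete (recall = 1): the query returns every tuple of the dataset.
    complete = dataset <= certain_answers
    if sound and complete:
        return "perfect"
    if sound:
        return "sound"
    if complete:
        return "complete"
    return "neither"


# Usage: a toy dataset of unary tuples.
dataset = {("ann",), ("bob",)}
print(classify_characterization({("ann",), ("bob",)}, dataset))
print(classify_characterization({("ann",)}, dataset))
print(classify_characterization({("ann",), ("bob",), ("eve",)}, dataset))
```

This covers only the comparison step of the verification problem; computing the certain answers themselves, and deciding whether an approximation is a best one, are the parts the complexity analysis addresses.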


Managing Data through the Lens of an Ontology

AI Magazine

While the amount of data stored in current information systems continuously grows, and the processes making use of such data become more and more complex, extracting knowledge and getting insights from these data, as well as governing both data and the associated processes, are still challenging tasks. The problem is complicated by the proliferation of data sources and services both within a single organization and in cooperating environments. Effectively accessing, integrating and managing data in complex organizations is still one of the main issues faced by the information technology industry today. Indeed, it is not surprising that data scientists spend a comparatively large amount of time in the data preparation phase of a project, compared with the data mining and knowledge discovery phase. Whether you call it data wrangling, data munging, or data integration, it is estimated that 50%-80% of a data scientist's time is spent on collecting and organizing data for analysis. If we consider that in any complex organization, data governance is also essential for tasks other than data analytics, we can conclude that the challenge of identifying, gathering, retaining, and providing access to all relevant data for the business at an acceptable cost is huge.


Data Quality in Ontology-based Data Access: The Case of Consistency

AAAI Conferences

Ontology-based data access (OBDA) is a new paradigm aiming at accessing and managing data by means of an ontology, i.e., a conceptual representation of the domain of interest in the underlying information system. In recent years, this paradigm has been used to provide users with abstract (independent of technological and system-oriented aspects), effective, and reasoning-intensive mechanisms for querying the data residing at the information system sources. In this paper we argue that OBDA, besides querying data, provides the right principles for devising a formal approach to data quality. In particular, we concentrate on one of the most important dimensions considered both in the literature and in the practice of data quality, namely consistency. We define a general framework for data consistency in OBDA, and present algorithms and complexity analysis for several relevant tasks related to the problem of checking data quality under this dimension, both at the extensional level (content of the data sources) and at the intensional level (schema of the data sources).


Ontology-Based Data Access with Dynamic TBoxes in DL-Lite

AAAI Conferences

In this paper we introduce the notion of mapping-based knowledge base (MKB) to formalize the situation where both the extensional and the intensional level of the ontology are determined by suitable mappings to a set of (relational) data sources. This allows for making the intensional level of the ontology as dynamic as traditionally the extensional level is. To do so, we resort to the meta-modeling capabilities of higher-order Description Logics, which allow us to see concepts and roles as individuals, and vice versa. The challenge in this setting is to design tractable query answering algorithms. Besides the definition of MKBs, our main result is that answering instance queries posed to MKBs expressed in Hi(DL-LiteR) can be done efficiently. In particular, we define a query rewriting technique that produces first-order (SQL) queries to be posed to the data sources.


Higher-Order Description Logics for Domain Metamodeling

AAAI Conferences

We investigate an extension of Description Logics (DL) with higher-order capabilities, based on Henkin-style semantics. Our study starts from the observation that the various possibilities of adding higher-order constructs to a DL form a spectrum of increasing expressive power, including domain metamodeling, i.e., using concepts and roles as predicate arguments. We argue that higher-order features of this type are sufficiently rich and powerful for the modeling requirements arising in many relevant situations, and therefore we carry out an investigation of the computational complexity of satisfiability and conjunctive query answering in DLs extended with such higher-order features. In particular, we show that adding domain metamodeling capabilities to SHIQ (the core of OWL 2) has no impact on the complexity of the various reasoning tasks. This is also true for DL-LiteR (the core of OWL 2 QL) under suitable restrictions on the queries.


On the evolution of the instance level of DL-lite knowledge bases

arXiv.org Artificial Intelligence

Recent papers address the issue of updating the instance level of knowledge bases expressed in Description Logic following a model-based approach. One of the outcomes of these papers is that the result of updating a knowledge base K is generally not expressible in the Description Logic used to express K. In this paper we introduce a formula-based approach to this problem, by revisiting some research work on formula-based updates developed in the '80s, in particular the WIDTIO (When In Doubt, Throw It Out) approach. We show that our operator enjoys desirable properties, including that both insertions and deletions according to this operator can be expressed in the DL used for the original KB. Also, we present polynomial time algorithms for the evolution of the instance level of knowledge bases expressed in the most expressive Description Logics of the DL-lite family.


Node Selection Query Languages for Trees

AAAI Conferences

The study of node-selection query languages for (finite) trees has been a major topic in the recent research on query languages for Web documents. On one hand, there has been an extensive study of XPath and its various extensions. On the other hand, query languages based on classical logics, such as first-order logic (FO) or monadic second-order logic (MSO), have been considered. Results in this area typically relate an XPath-based language to a classical logic. What has yet to emerge is an XPath-related language that is as expressive as MSO, and at the same time enjoys the computational properties of XPath, which are linear query evaluation and exponential query-containment test. In this paper we propose μXPath, which is the alternation-free fragment of XPath extended with fixpoint operators. Using two-way alternating automata, we show that this language does combine desired expressiveness and computational properties, placing it as an attractive candidate as the definite query language for trees.