Peñaloza, Rafael
A Call for Critically Rethinking and Reforming Data Analysis in Empirical Software Engineering
Esposito, Matteo, Robredo, Mikel, Sridharan, Murali, Travassos, Guilherme Horta, Peñaloza, Rafael, Lenarduzzi, Valentina
Context: Empirical Software Engineering (ESE) drives innovation in SE through qualitative and quantitative studies. However, concerns about the correct application of empirical methodologies have existed since the 2006 Dagstuhl seminar on SE. Objective: To analyze three decades of SE research, identify mistakes in statistical methods, and evaluate experts' ability to detect and address these issues. Methods: We conducted a literature survey of ~27,000 empirical studies, using LLMs to classify statistical methodologies as adequate or inadequate. Additionally, we selected 30 primary studies and held a workshop with 33 ESE experts to assess their ability to identify and resolve statistical issues. Results: Significant statistical issues were found in the primary studies, and experts showed limited ability to detect and correct these methodological problems, raising concerns about the broader ESE community's proficiency in this area. Conclusions: Despite its possible limitations, our study sheds light on recurring issues stemming from the copy-and-paste of methodological choices from earlier works and from the continued publication of inadequate approaches, which promote dubious results and hinder the spread of correct statistical practices among researchers. These findings justify further investigation into empirical rigor in software engineering, to expose such recurring issues and to establish a framework for reassessing how statistical methodology is applied in our field. This work therefore calls for critically rethinking and reforming data analysis in empirical software engineering, paving the way for future work.
Semiring Provenance for Lightweight Description Logics
Bourgaux, Camille, Ozaki, Ana, Peñaloza, Rafael
We investigate semiring provenance--a successful framework originally defined in the relational database setting--for description logics. In this context, the ontology axioms are annotated with elements of a commutative semiring, and these annotations are propagated to the ontology consequences in a way that reflects how they are derived. We define a provenance semantics for a language that encompasses several lightweight description logics and show its relationships with semantics that have been defined for ontologies annotated with a specific kind of annotation (such as fuzzy degrees). We show that under some restrictions on the semiring, the semantics satisfies desirable properties (such as extending the semiring provenance defined for databases). We then focus on the well-known why-provenance, which allows the semiring provenance to be computed for every additively and multiplicatively idempotent commutative semiring, and for which we study the complexity of problems related to the provenance of an axiom or a conjunctive query answer. Finally, we consider two more restricted cases, which correspond to the so-called positive Boolean provenance and lineage in the database setting. For these cases, we exhibit relationships with well-known notions related to explanations in description logics and complete our complexity analysis. As a side contribution, we provide conditions on an $\mathcal{ELHI}_\bot$ ontology that guarantee tractable reasoning.
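As a rough illustration of the why-provenance mentioned above (toy data and axiom names are assumed for this sketch, not taken from the paper), annotations can be read as sets of sets of axiom labels, where addition collects alternative derivations and multiplication combines the axioms used jointly in one derivation:

    # Minimal sketch of why-provenance propagation over a toy rule chain.
    # Values are sets of sets of axiom labels: plus = alternative derivations,
    # times = labels of axioms used together in a single derivation.
    from itertools import product

    def plus(a, b):
        """Alternative derivations: union of the two sets of witness sets."""
        return a | b

    def times(a, b):
        """Joint use: pairwise union of witness sets."""
        return {frozenset(x | y) for x, y in product(a, b)}

    # Toy labelled axioms: A subclassof B (ax1), B subclassof C (ax2),
    # and an independent direct axiom A subclassof C (ax3).
    prov = {
        ("A", "B"): {frozenset({"ax1"})},
        ("B", "C"): {frozenset({"ax2"})},
        ("A", "C"): {frozenset({"ax3"})},
    }

    # Chaining ax1 and ax2 also derives A subclassof C, so its provenance
    # combines the chained witness with the direct axiom.
    derived = times(prov[("A", "B")], prov[("B", "C")])
    prov[("A", "C")] = plus(prov[("A", "C")], derived)
    print(prov[("A", "C")])   # {frozenset({'ax3'}), frozenset({'ax1', 'ax2'})}

Both operations in this sketch are idempotent, matching the class of semirings for which, per the abstract, why-provenance suffices.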
Answering Fuzzy Queries over Fuzzy DL-Lite Ontologies
Pasi, Gabriella, Peñaloza, Rafael
A prominent problem in knowledge representation is how to answer queries while also taking into account the implicit consequences of an ontology representing domain knowledge. While this problem has been widely studied within the realm of description logic ontologies, it has been surprisingly neglected within the context of vague or imprecise knowledge, particularly from the point of view of mathematical fuzzy logic. In this paper we study the problem of answering conjunctive queries and threshold queries w.r.t. ontologies in fuzzy DL-Lite. Specifically, we show through a rewriting approach that threshold query answering w.r.t. consistent ontologies remains in $AC^0$ in data complexity, whereas conjunctive query answering depends strongly on the selected triangular norm, which affects the underlying semantics. For the idempotent Gödel t-norm, we provide an effective method based on a reduction to the classical case. This paper is under consideration in Theory and Practice of Logic Programming (TPLP).
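For a concrete sense of why the chosen triangular norm matters (the degrees below are assumed, purely for illustration), compare the Gödel t-norm, which interprets conjunction as the minimum, with the product t-norm on the same query match:

    # Illustrative only: the same fuzzy data can pass a threshold under one
    # t-norm semantics and fail it under another.
    from functools import reduce

    atom_degrees = [0.8, 0.7]    # degrees of the two query atoms for some match

    godel = min(atom_degrees)                              # 0.7
    product = reduce(lambda x, y: x * y, atom_degrees)     # ~0.56

    threshold = 0.6
    print(godel >= threshold, product >= threshold)        # True False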
Union and Intersection of all Justifications
Chen, Jieying, Ma, Yue, Peñaloza, Rafael, Yang, Hui
We present new algorithms for computing the union and intersection of all justifications for a given ontological consequence without first computing the set of all justifications. Through an empirical evaluation, we show that our approach works well in practice for expressive description logics. In particular, the union of all justifications can be computed much faster than with existing justification-enumeration approaches. We further discuss how to use these results to repair ontologies.
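A tiny example of the two notions (hypothetical axiom names; the paper's point is precisely to obtain these sets without enumerating all justifications first):

    # Given all justifications of a consequence as sets of axioms, their union
    # contains every axiom relevant to the entailment, and their intersection
    # the axioms occurring in every justification.
    justifications = [{"ax1", "ax2"}, {"ax1", "ax3"}]

    union = set().union(*justifications)               # {'ax1', 'ax2', 'ax3'}
    intersection = set.intersection(*justifications)   # {'ax1'}
    print(union, intersection)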
Reasoning with Contextual Knowledge and Influence Diagrams
Acar, Erman, Peñaloza, Rafael
Influence diagrams (IDs) are well-known formalisms extending Bayesian networks to model decision situations under uncertainty. Although they are convenient as a decision-theoretic tool, their knowledge representation ability is limited in capturing other crucial notions such as logical consistency. We complement IDs with the lightweight description logic (DL) EL to overcome such limitations. We consider a setup where DL axioms hold in some contexts, yet the actual context is uncertain. The framework benefits from the convenience of using DL as a domain knowledge representation language and the modelling strength of IDs to deal with decisions over contexts in the presence of contextual uncertainty. We define related reasoning problems and study their computational complexity.
Probabilistic Temporal Logic over Finite Traces (Technical Report)
Maggi, Fabrizio M., Montali, Marco, Peñaloza, Rafael
Temporal logics over finite traces have recently gained attention due to their use in real-world applications, in particular in business process modelling and planning. In real life, processes contain some degree of uncertainty that is impossible to handle with classical logics. We propose a new probabilistic temporal logic over finite traces based on superposition semantics, in which all possible evolutions remain open until observed. We study the properties of the logic and provide automata-based mechanisms for deriving probabilistic inferences from its formulas. We ground the approach in the context of declarative process modelling, showing how the temporal patterns used in Declare can be lifted to our setting, discussing how probabilistic inferences can be exploited to support key offline and runtime reasoning tasks, and showing how to discover probabilistic Declare patterns from event data by minor adjustments to existing discovery algorithms.
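As a hedged sketch of the kind of probabilistic reading involved (the log, pattern, and names are illustrative, not the paper's formal semantics), the probability of a Declare-style response constraint can be estimated as the fraction of traces satisfying it:

    # response(a, b): every occurrence of a is eventually followed by a b.
    def satisfies_response(trace, a="a", b="b"):
        for i, ev in enumerate(trace):
            if ev == a and b not in trace[i + 1:]:
                return False
        return True

    log = [["a", "c", "b"], ["a", "c"], ["c", "b"], ["a", "b", "a", "b"]]
    prob = sum(satisfies_response(t) for t in log) / len(log)
    print(prob)   # 0.75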
Repairing Ontologies via Axiom Weakening
Troquard, Nicolas (Faculty of Computer Science, Free University of Bozen-Bolzano) | Confalonieri, Roberto (Smart Data Factory, Free University of Bozen-Bolzano) | Galliani, Pietro (Faculty of Computer Science, Free University of Bozen-Bolzano) | Peñaloza, Rafael (Faculty of Computer Science, Free University of Bozen-Bolzano) | Porello, Daniele (Faculty of Computer Science, Free University of Bozen-Bolzano) | Kutz, Oliver (Faculty of Computer Science, Free University of Bozen-Bolzano)
Ontology engineering is a hard and error-prone task, in which small changes may lead to errors, or even produce an inconsistent ontology. As ontologies grow in size, the need for automated methods for repairing inconsistencies while preserving as much of the original knowledge as possible increases. Most previous approaches to this task are based on removing a few axioms from the ontology to regain consistency. We propose a new method based on weakening these axioms to make them less restrictive, employing refinement operators. We introduce the theoretical framework for weakening DL ontologies, propose algorithms to repair ontologies based on the framework, and provide an analysis of the computational complexity. Through an empirical analysis made over real-life ontologies, we show that our approach preserves significantly more of the original knowledge of the ontology than removing axioms.
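The following toy sketch (hypothetical class hierarchy and names, not the paper's refinement operators) conveys the intuition of weakening a subclass axiom by generalizing its right-hand side instead of deleting it:

    # Toy class hierarchy: child -> direct parent.
    superclasses = {
        "Penguin": "Bird",
        "Bird": "Animal",
        "FlyingAnimal": "Animal",
    }

    def generalize(cls):
        """One upward refinement step: replace a class by its direct parent."""
        return superclasses.get(cls, "Thing")   # Thing = the top concept

    # Offending axiom: Penguin subclassof FlyingAnimal.
    # Instead of removing it, weaken its right-hand side.
    lhs, rhs = "Penguin", "FlyingAnimal"
    weakened = (lhs, generalize(rhs))
    print(weakened)   # ('Penguin', 'Animal') -- less restrictive, keeps some content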
Minimal Undefinedness for Fuzzy Answer Sets
Alviano, Mario (University of Calabria) | Amendola, Giovanni (University of Calabria) | Peñaloza, Rafael (Free University of Bozen-Bolzano)
Fuzzy Answer Set Programming (FASP) combines the non-monotonic reasoning typical of Answer Set Programming with the capability of Fuzzy Logic to deal with imprecise information and paraconsistent reasoning. In the context of paraconsistent reasoning, the fundamental principle of minimal undefinedness states that truth degrees close to 0 and 1 should be preferred to those close to 0.5, to minimize the ambiguity of the scenario. The aim of this paper is to enforce such a principle in FASP through the minimization of a measure of undefinedness. Algorithms that minimize the undefinedness of fuzzy answer sets are presented and implemented.
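As an illustration only (the measure below is assumed for this sketch and need not coincide with the paper's definition), minimal undefinedness amounts to preferring interpretations whose truth degrees lie close to 0 or 1:

    # 1 - |2x - 1| is 0 at x = 0 or x = 1 and maximal (1) at x = 0.5,
    # so summing it over all degrees quantifies how "ambiguous" an
    # interpretation is.
    def undefinedness(degrees):
        return sum(1 - abs(2 * x - 1) for x in degrees)

    candidate_answer_sets = {
        "I1": [0.5, 0.5, 1.0],
        "I2": [0.9, 0.1, 1.0],
    }
    best = min(candidate_answer_sets,
               key=lambda k: undefinedness(candidate_answer_sets[k]))
    print(best)   # 'I2' -- degrees closer to 0 and 1 minimize ambiguity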
Answering Fuzzy Conjunctive Queries over Finitely Valued Fuzzy Ontologies
Borgwardt, Stefan, Mailis, Theofilos, Peñaloza, Rafael, Turhan, Anni-Yasmin
Fuzzy Description Logics (DLs) provide a means for representing vague knowledge about an application domain. In this paper, we study fuzzy extensions of conjunctive queries (CQs) over the DL $\mathcal{SROIQ}$ based on finite chains of degrees of truth. To answer such queries, we extend a well-known technique that reduces the fuzzy ontology to a classical one, and use classical DL reasoners as a black box. We improve the complexity of previous reduction techniques for finitely valued fuzzy DLs, which allows us to prove tight complexity results for answering certain kinds of fuzzy CQs. We conclude with an experimental evaluation of a prototype implementation, showing the feasibility of our approach.
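A rough sketch of the reduction idea (hypothetical encoding and names, not the paper's exact translation): over a finite chain of degrees, fuzzy assertions can be rendered as crisp "cut" concepts that a classical reasoner can process as a black box:

    # Finite chain of truth degrees assumed for the example.
    degrees = [0.0, 0.25, 0.5, 0.75, 1.0]

    def cut_assertions(concept, individual, degree):
        """Crisp assertions implied by 'individual belongs to concept to degree >= degree'."""
        return [f"{concept}_geq_{d}({individual})" for d in degrees if 0 < d <= degree]

    print(cut_assertions("Tall", "alice", 0.75))
    # ['Tall_geq_0.25(alice)', 'Tall_geq_0.5(alice)', 'Tall_geq_0.75(alice)']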