Collaborating Authors

 Rodler, Patrick


Don't Treat the Symptom, Find the Cause! Efficient Artificial-Intelligence Methods for (Interactive) Debugging

arXiv.org Artificial Intelligence

In the modern world, we are permanently using, leveraging, interacting with, and relying upon systems of ever higher sophistication, ranging from our cars, recommender systems in e-commerce, and networks when we go online, to integrated circuits when using our PCs and smartphones, the power grid to ensure our energy supply, security-critical software when accessing our bank accounts, and spreadsheets for financial planning and decision making. The complexity of these systems coupled with our high dependency on them implies both a non-negligible likelihood of system failures, and a high potential that such failures have significant negative effects on our everyday life. For that reason, it is a vital requirement to keep the harm of emerging failures to a minimum, which means minimizing the system downtime as well as the cost of system repair. This is where model-based diagnosis comes into play. Model-based diagnosis is a principled, domain-independent approach that can be generally applied to troubleshoot systems of a wide variety of types, including all the ones mentioned above, and many more. It exploits and orchestrates, among others, techniques for knowledge representation, automated reasoning, heuristic problem solving, intelligent search, optimization, stochastics, statistics, decision making under uncertainty, machine learning, as well as calculus, combinatorics and set theory to detect, localize, and fix faults in abnormally behaving systems. In this thesis, we will give an introduction to the topic of model-based diagnosis, point out the major challenges in the field, and discuss a selection of approaches from our research addressing these issues.


DynamicHS: Streamlining Reiter's Hitting-Set Tree for Sequential Diagnosis

arXiv.org Artificial Intelligence

Given a system that does not work as expected, Sequential Diagnosis (SD) aims at suggesting a series of system measurements to isolate the true explanation for the system's misbehavior from a potentially exponential set of possible explanations. To reason about the best next measurement, SD methods usually require a sample of possible fault explanations at each step of the iterative diagnostic process. The computation of this sample can be accomplished by various diagnostic search algorithms. Among those, Reiter's HS-Tree is one of the most popular due to its desirable properties and general applicability. Usually, HS-Tree is used in a stateless fashion throughout the SD process to (re)compute a sample of possible fault explanations in each iteration, each time given the latest (updated) system knowledge including all so-far collected measurements. In this process, the search tree built is discarded between two iterations, although often large parts of the tree have to be rebuilt in the next iteration, involving redundant operations and calls to costly reasoning services. As a remedy to this, we propose DynamicHS, a variant of HS-Tree that maintains state throughout the diagnostic session and additionally embraces special strategies to minimize the number of expensive reasoner invocations. In this vein, DynamicHS provides an answer to a longstanding question posed by Raymond Reiter in his seminal paper from 1987. Extensive evaluations on real-world diagnosis problems confirm the reasonability of DynamicHS and attest to its clear superiority over HS-Tree with respect to computation time. More specifically, DynamicHS outperformed HS-Tree in 96% of the executed sequential diagnosis sessions and, per run, the latter required up to 800% of the time of the former. Remarkably, DynamicHS achieves these performance improvements while preserving all desirable properties as well as the general applicability of HS-Tree.
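To make the hitting-set idea behind HS-Tree more concrete, the following is a minimal sketch of a breadth-first hitting-set search. It is not DynamicHS and not a faithful reproduction of Reiter's algorithm: it assumes that all conflict sets are given up front as plain sets, whereas HS-Tree computes conflicts on demand through calls to a reasoner. The function name `minimal_hitting_sets` is hypothetical and chosen only for illustration.

```python
from collections import deque


def minimal_hitting_sets(conflicts):
    """Breadth-first search for all minimal hitting sets.

    Assumption: `conflicts` is an explicit, precomputed collection of sets.
    In a real diagnosis engine each label would come from a reasoner call.
    """
    conflicts = [frozenset(c) for c in conflicts]
    found = []                     # minimal hitting sets discovered so far
    queue = deque([frozenset()])   # a node is the set of edge labels on its path
    seen = {frozenset()}           # avoid expanding the same node twice
    while queue:
        node = queue.popleft()
        # Prune: a superset of a known hitting set cannot be minimal.
        if any(hs <= node for hs in found):
            continue
        # Label the node with some conflict that it does not hit yet.
        label = next((c for c in conflicts if not (c & node)), None)
        if label is None:
            found.append(node)     # the node hits every conflict
            continue
        # One child per element of the labelling conflict.
        for component in sorted(label):
            child = node | {component}
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return found


if __name__ == "__main__":
    # Two conflicts sharing component 2.
    print(minimal_hitting_sets([{1, 2}, {2, 3}]))
```

Because nodes are expanded in order of increasing size, every hitting set is found before any of its supersets is processed, which is what makes the subset-based pruning safe in this sketch. A stateless use in sequential diagnosis would call such a function from scratch after each new measurement, which is the redundancy the abstract describes.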


Memory-Limited Model-Based Diagnosis

arXiv.org Artificial Intelligence

Various model-based diagnosis scenarios require the computation of the most preferred fault explanations. Existing algorithms that are sound (i.e., output only actual fault explanations) and complete (i.e., can return all explanations), however, require exponential space to achieve this task. As a remedy, we propose two novel diagnostic search algorithms, called RBF-HS (Recursive Best-First Hitting Set Search) and HBF-HS (Hybrid Best-First Hitting Set Search), which build upon tried and tested techniques from the heuristic search domain. RBF-HS can enumerate an arbitrary predefined finite number of fault explanations in best-first order within linear space bounds, without sacrificing the desirable soundness or completeness properties. The idea of HBF-HS is to find a trade-off between runtime optimization and a restricted space consumption that does not exceed the available memory. In extensive experiments on real-world diagnosis cases we compared our approaches to Reiter's HS-Tree, a state-of-the-art method that gives the same theoretical guarantees and is as generally applicable as the suggested algorithms. For the computation of minimum-cardinality fault explanations, we find that (1) RBF-HS reduces memory requirements substantially in most cases, by up to several orders of magnitude, (2) in more than a third of the cases, both memory savings and runtime savings are achieved, and (3) when the runtime overhead is significant, using HBF-HS instead of RBF-HS reduces the runtime to values comparable with HS-Tree while keeping the used memory reasonably bounded. When computing most probable fault explanations, we observe that RBF-HS tends to trade memory savings more or less one-to-one for runtime overheads. Again, HBF-HS proves to be a reasonable remedy to cut down the runtime while complying with practicable memory bounds.


Do We Really Sample Right In Model-Based Diagnosis?

arXiv.org Artificial Intelligence

Statistical samples, in order to be representative, have to be drawn from a population in a random and unbiased way. Nevertheless, it is common practice in the field of model-based diagnosis to make estimations from (biased) best-first samples. One example is the computation of a few most probable fault explanations for a defective system and the use of these to assess which aspect of the system, if measured, would bring the highest information gain. In this work, we scrutinize whether these statistically not well-founded conventions, which both diagnosis researchers and practitioners have adhered to for decades, are indeed reasonable. To this end, we empirically analyze various sampling methods that generate fault explanations. We study the representativeness of the produced samples in terms of their estimations about fault explanations and how well they guide diagnostic decisions, and we investigate the impact of sample size, the optimal trade-off between sampling efficiency and effectiveness, and how approximate sampling techniques compare to exact ones.
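The information-gain estimate mentioned above can be illustrated with a small sketch. It assumes, purely for illustration, that each sampled fault explanation deterministically predicts one outcome for a candidate measurement and that the sample's probabilities are renormalized to sum to one; the names `entropy` and `expected_information_gain` are hypothetical and do not come from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(diagnoses, predicts):
    """Expected entropy reduction from taking one measurement.

    diagnoses: dict mapping a fault explanation to its (unnormalized) weight.
    predicts:  dict mapping each fault explanation to the measurement outcome
               it predicts (assumption: the prediction is deterministic).
    """
    total = sum(diagnoses.values())
    prior = {d: w / total for d, w in diagnoses.items()}

    # Group the probability mass of the sample by predicted outcome.
    by_outcome = {}
    for d, p in prior.items():
        by_outcome.setdefault(predicts[d], []).append(p)

    # Expected entropy of the posterior, weighted by outcome probability.
    expected_posterior = 0.0
    for ps in by_outcome.values():
        mass = sum(ps)
        expected_posterior += mass * entropy([p / mass for p in ps])

    return entropy(prior.values()) - expected_posterior


if __name__ == "__main__":
    sample = {"D1": 0.25, "D2": 0.25, "D3": 0.25, "D4": 0.25}
    even_split = {"D1": True, "D2": True, "D3": False, "D4": False}
    print(expected_information_gain(sample, even_split))
```

Since the estimate depends entirely on which explanations are in the sample and on their weights, a best-first sample and a random sample of the same size can rank candidate measurements differently, which is the kind of effect the study examines.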


Sound, Complete, Linear-Space, Best-First Diagnosis Search

arXiv.org Artificial Intelligence

Various model-based diagnosis scenarios require the computation of the most preferred fault explanations. Existing algorithms that are sound (i.e., output only actual fault explanations) and complete (i.e., can return all explanations), however, require exponential space to achieve this task. As a remedy, to enable successful diagnosis on memory-restricted devices and for memory-intensive problem cases, we propose RBF-HS, a diagnostic search method based on Korf's well-known RBFS algorithm. RBF-HS can enumerate an arbitrary fixed number of fault explanations in best-first order within linear space bounds, without sacrificing the desirable soundness or completeness properties. Evaluations using real-world diagnosis cases show that RBF-HS, when used to compute minimum-cardinality fault explanations, in most cases saves substantial space (up to 98%) while requiring only reasonably more or even less time than Reiter's HS-Tree, a commonly used sound, complete and best-first diagnosis search that is as generally applicable as RBF-HS.


The Scheduling Job-Set Optimization Problem: A Model-Based Diagnosis Approach

arXiv.org Artificial Intelligence

A common issue for companies is that the volume of product orders may at times exceed the production capacity. We formally introduce two novel problems dealing with the question of which orders to discard or postpone in order to meet certain (timeliness) goals, and try to approach them by means of model-based diagnosis. In thorough analyses, we identify many similarities of the introduced problems to diagnosis problems, but also reveal crucial idiosyncrasies and outline ways to handle or leverage them. Finally, a proof-of-concept evaluation on industrial-scale problem instances from a well-known scheduling benchmark suite demonstrates that one of the two formalized problems can be well attacked by out-of-the-box model-based diagnosis tools.


Too Good to Throw Away: A Powerful Reuse Strategy for Reiter's Hitting Set Tree

AAAI Conferences

Reiter's HS-Tree is one of the most popular diagnostic search algorithms due to its desirable properties and general applicability. In sequential diagnosis, where the addressed diagnosis problem is subject to successive change through the acquisition of additional knowledge about the diagnosed system, HS-Tree is used in a stateless fashion. That is, the existing search tree is discarded when new knowledge is obtained, although large parts of the tree are often still relevant and have to be rebuilt in the next iteration, involving redundant operations and costly reasoner calls. As a remedy, we propose DynamicHS, a variant of HS-Tree that avoids these redundancy issues by maintaining state throughout sequential diagnosis while preserving all desirable properties of HS-Tree. Evaluations in a problem domain where HS-Tree is the state-of-the-art diagnostic method reveal stable and significant time savings achieved by DynamicHS.


Towards Optimizing Reiter's HS-Tree for Sequential Diagnosis

arXiv.org Artificial Intelligence

Reiter's HS-Tree is one of the most popular diagnostic search algorithms due to its desirable properties and general applicability. In sequential diagnosis, where the addressed diagnosis problem is subject to successive change through the acquisition of additional knowledge about the diagnosed system, HS-Tree is used in a stateless fashion. That is, the existing search tree is discarded when new knowledge is obtained, although large parts of the tree are often still relevant and have to be rebuilt in the next iteration, involving redundant operations and costly reasoner calls. As a remedy to this, we propose DynamicHS, a variant of HS-Tree that avoids these redundancy issues by maintaining state throughout sequential diagnosis while preserving all desirable properties of HS-Tree. Preliminary results of ongoing evaluations in a problem domain where HS-Tree is the state-of-the-art diagnostic method suggest that DynamicHS achieves significant time savings by reducing expensive reasoner calls.


Are Query-Based Ontology Debuggers Really Helping Knowledge Engineers?

arXiv.org Artificial Intelligence

Real-world semantic or knowledge-based systems, e.g., in the biomedical domain, can become large and complex. Tool support for the localization and repair of faults within knowledge bases of such systems can therefore be essential for their practical success. Correspondingly, a number of knowledge base debugging approaches, in particular for ontology-based systems, were proposed throughout recent years. Query-based debugging is a comparatively recent interactive approach that localizes the true cause of an observed problem by asking knowledge engineers a series of questions. Concrete implementations of this approach exist, such as the OntoDebug plug-in for the ontology editor Protégé. To validate that a newly proposed method is favorable over an existing one, researchers often rely on simulation-based comparisons. Such an evaluation approach, however, has certain limitations and often cannot fully inform us about a method's true usefulness. We therefore conducted different user studies to assess the practical value of query-based ontology debugging. One main insight from the studies is that the considered interactive approach is indeed more efficient than an alternative algorithmic debugging based on test cases. We also observed that users frequently made errors in the process, which highlights the importance of a careful design of the queries that users need to answer.


A New Expert Questioning Approach to More Efficient Fault Localization in Ontologies

arXiv.org Artificial Intelligence

When ontologies reach a certain size and complexity, faults such as inconsistencies, unsatisfiable classes or wrong entailments are hardly avoidable. Locating the incorrect axioms that cause these faults is a hard and time-consuming task. Addressing this issue, several techniques for semi-automatic fault localization in ontologies have been proposed. Often, these approaches involve a human expert who provides answers to system-generated questions about the intended (correct) ontology in order to reduce the possible fault locations. To suggest questions that are as informative as possible, existing methods draw on various algorithmic optimizations as well as heuristics. However, these computations are often based on certain assumptions about the interacting user. In this work, we characterize and discuss different user types and show that existing approaches do not achieve optimal efficiency for all of them. As a remedy, we suggest a new type of expert question which aims at fitting the answering behavior of all analyzed experts. Moreover, we present an algorithm to optimize this new query type which is fully compatible with the (tried and tested) heuristics used in the field. Experiments on faulty real-world ontologies show the potential of the new querying method for minimizing the expert consultation time, independent of the expert type. Besides, the gained insights can inform the design of interactive debugging tools towards better meeting their users' needs.