Rodler
How should I compute my candidates? A taxonomy and classification of diagnosis computation algorithms
This work proposes a taxonomy for diagnosis computation methods which allows their standardized assessment, classification and comparison. The aim is to (i) give researchers and practitioners an impression of the diverse landscape of available diagnostic techniques, (ii) allow them to easily retrieve the main features as well as pros and cons of the approaches, (iii) enable an easy and clear comparison of the techniques based on their characteristics wrt. a list of important and well-defined properties, and (iv) facilitate the selection of the "right" algorithm to adopt for a particular problem case, e.g., in practical diagnostic settings, for comparison in experimental evaluations, or for reuse, modification, extension, or improvement in the course of research.
Rodler
Reiter's HS-Tree is one of the most popular diagnostic search algorithms due to its desirable properties and general applicability. In sequential diagnosis, where the addressed diagnosis problem changes successively as additional knowledge about the diagnosed system is acquired, HS-Tree is used in a stateless fashion. That is, the existing search tree is discarded when new knowledge is obtained, even though large parts of the tree often remain relevant and have to be rebuilt in the next iteration, involving redundant operations and costly reasoner calls. As a remedy, we propose DynamicHS, a variant of HS-Tree that avoids these redundancy issues by maintaining state throughout sequential diagnosis while preserving all desirable properties of HS-Tree. Evaluations in a problem domain where HS-Tree is the state-of-the-art diagnostic method reveal stable and significant time savings achieved by DynamicHS.
DynamicHS: Streamlining Reiter's Hitting-Set Tree for Sequential Diagnosis
Given a system that does not work as expected, Sequential Diagnosis (SD) aims at suggesting a series of system measurements to isolate the true explanation for the system's misbehavior from a potentially exponential set of possible explanations. To reason about the best next measurement, SD methods usually require a sample of possible fault explanations at each step of the iterative diagnostic process. The computation of this sample can be accomplished by various diagnostic search algorithms. Among those, Reiter's HS-Tree is one of the most popular due to its desirable properties and general applicability. Usually, HS-Tree is used in a stateless fashion throughout the SD process to (re)compute a sample of possible fault explanations in each iteration, each time given the latest (updated) system knowledge including all measurements collected so far. In doing so, the built search tree is discarded between two iterations, although large parts of the tree often have to be rebuilt in the next iteration, involving redundant operations and calls to costly reasoning services. As a remedy, we propose DynamicHS, a variant of HS-Tree that maintains state throughout the diagnostic session and additionally embraces special strategies to minimize the number of expensive reasoner invocations. In this vein, DynamicHS provides an answer to a longstanding question posed by Raymond Reiter in his seminal paper from 1987. Extensive evaluations on real-world diagnosis problems demonstrate the reasonability of DynamicHS and testify to its clear superiority over HS-Tree wrt. computation time. More specifically, DynamicHS outperformed HS-Tree in 96% of the executed sequential diagnosis sessions and, per run, the latter required up to 800% of the time of the former. Remarkably, DynamicHS achieves these performance improvements while preserving all desirable properties as well as the general applicability of HS-Tree.
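To make the stateless use of HS-Tree more concrete, the following is a minimal sketch of a breadth-first hitting-set tree in the spirit of Reiter's algorithm. It is not the DynamicHS method and not the authors' implementation. The names `make_toy_oracle` and `hs_tree` are hypothetical, and the costly reasoner is replaced by a toy oracle over a fixed list of conflicts, which is an assumption made purely for illustration.

```python
from collections import deque
from typing import Callable, FrozenSet, Iterable, List, Optional

Conflict = FrozenSet[str]
Oracle = Callable[[FrozenSet[str]], Optional[Conflict]]


def make_toy_oracle(conflicts: Iterable[Iterable[str]]) -> Oracle:
    """Hypothetical stand-in for a reasoner call.

    Returns the first known conflict that is not hit by `path`,
    or None if `path` hits every conflict (i.e. `path` is a diagnosis).
    """
    fixed = [frozenset(c) for c in conflicts]

    def oracle(path: FrozenSet[str]) -> Optional[Conflict]:
        for conflict in fixed:
            if not (conflict & path):
                return conflict
        return None

    return oracle


def hs_tree(oracle: Oracle, max_diagnoses: Optional[int] = None) -> List[FrozenSet[str]]:
    """Breadth-first hitting-set search; returns minimal diagnoses
    in order of increasing cardinality."""
    diagnoses: List[FrozenSet[str]] = []
    seen = set()                    # node labels already queued (duplicate pruning)
    queue = deque([frozenset()])    # the root node has an empty path
    while queue:
        path = queue.popleft()
        # Prune nodes that are supersets of an already found diagnosis.
        if any(d <= path for d in diagnoses):
            continue
        conflict = oracle(path)     # in practice, this is the expensive step
        if conflict is None:
            diagnoses.append(path)
            if max_diagnoses is not None and len(diagnoses) >= max_diagnoses:
                break
            continue
        # Expand the node: one child per element of the conflict.
        for component in sorted(conflict):
            child = path | {component}
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return diagnoses


oracle = make_toy_oracle([{"a", "b"}, {"b", "c"}])
print([sorted(d) for d in hs_tree(oracle)])  # [['b'], ['a', 'c']]
```

In a stateless sequential diagnosis loop, `hs_tree` would be called from scratch after every new measurement, so all oracle calls for unchanged parts of the tree are repeated. This repetition is the redundancy that a stateful variant aims to avoid.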
Do We Really Sample Right In Model-Based Diagnosis?
Rodler, Patrick, Elichanova, Fatima
Statistical samples, in order to be representative, have to be drawn from a population in a random and unbiased way. Nevertheless, it is common practice in the field of model-based diagnosis to make estimations from (biased) best-first samples. One example is the computation of a few most probable fault explanations for a defective system and the use of these to assess which aspect of the system, if measured, would bring the highest information gain. In this work, we scrutinize whether these statistically ill-founded conventions, which both diagnosis researchers and practitioners have adhered to for decades, are indeed reasonable. To this end, we empirically analyze various sampling methods that generate fault explanations. We study the representativeness of the produced samples in terms of their estimations about fault explanations and how well they guide diagnostic decisions, and we investigate the impact of sample size, the optimal trade-off between sampling efficiency and effectiveness, and how approximate sampling techniques compare to exact ones.
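As an illustration of how a sample of fault explanations can be used to estimate the information gain of a measurement, consider the following simplified sketch. It assumes that every diagnosis in the sample deterministically predicts a binary outcome for each measurement, in which case the expected information gain equals the entropy of the estimated outcome distribution. The function names `outcome_entropy` and `best_measurement` and all data are hypothetical. The sketch does not reproduce the sampling methods or the evaluation of the paper.

```python
import math
from typing import Dict, FrozenSet, Set

Diagnosis = FrozenSet[str]


def outcome_entropy(sample: Dict[Diagnosis, float],
                    predicts_positive: Set[Diagnosis]) -> float:
    """Entropy (in bits) of a binary measurement outcome, estimated from a
    sample of diagnoses with (possibly unnormalized) probability weights."""
    total = sum(sample.values())
    p = sum(w for d, w in sample.items() if d in predicts_positive) / total
    if p <= 0.0 or p >= 1.0:
        return 0.0  # outcome is certain under the sample: nothing to learn
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def best_measurement(sample: Dict[Diagnosis, float],
                     predictions: Dict[str, Set[Diagnosis]]) -> str:
    """Pick the measurement whose estimated outcome entropy is highest."""
    return max(predictions, key=lambda m: outcome_entropy(sample, predictions[m]))


# Hypothetical sample of three diagnoses with their estimated probabilities.
d1, d2, d3 = frozenset({"c1"}), frozenset({"c2"}), frozenset({"c3"})
sample = {d1: 0.5, d2: 0.25, d3: 0.25}
# For each measurement: the diagnoses that predict a positive outcome.
predictions = {"m1": {d1}, "m2": {d2}}
print(best_measurement(sample, predictions))  # m1
```

Because the estimate depends only on the probability mass of the sampled diagnoses, a biased (e.g. best-first) sample can shift `p` and thereby change which measurement appears best.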
The Scheduling Job-Set Optimization Problem: A Model-Based Diagnosis Approach
Rodler, Patrick, Teppan, Erich
A common issue for companies is that the volume of product orders may at times exceed the production capacity. We formally introduce two novel problems dealing with the question of which orders to discard or postpone in order to meet certain (timeliness) goals, and approach them by means of model-based diagnosis. In thorough analyses, we identify many similarities of the introduced problems to diagnosis problems, but also reveal crucial idiosyncrasies and outline ways to handle or leverage them. Finally, a proof-of-concept evaluation on industrial-scale problem instances from a well-known scheduling benchmark suite demonstrates that one of the two formalized problems can be tackled well by out-of-the-box model-based diagnosis tools.
On Expert Behaviors and Question Types for Efficient Query-Based Ontology Fault Localization
We challenge existing query-based ontology fault localization methods wrt. assumptions they make, criteria they optimize, and interaction means they use. We find that their efficiency depends largely on the behavior of the interacting expert, that performed calculations can be inefficient or imprecise, and that used optimization criteria are often not fully realistic. As a remedy, we suggest a novel (and simpler) interaction approach which overcomes all identified problems and, in comprehensive experiments on faulty real-world ontologies, enables a successful fault localization while requiring fewer expert interactions in 66 % of the cases, and always at least 80 % less expert waiting time, compared to existing methods.
Are Query-Based Ontology Debuggers Really Helping Knowledge Engineers?
Rodler, Patrick, Jannach, Dietmar, Schekotihin, Konstantin, Fleiss, Philipp
Real-world semantic or knowledge-based systems, e.g., in the biomedical domain, can become large and complex. Tool support for the localization and repair of faults within knowledge bases of such systems can therefore be essential for their practical success. Correspondingly, a number of knowledge base debugging approaches, in particular for ontology-based systems, were proposed throughout recent years. Query-based debugging is a comparably recent interactive approach that localizes the true cause of an observed problem by asking knowledge engineers a series of questions. Concrete implementations of this approach exist, such as the OntoDebug plug-in for the ontology editor Protégé. To validate that a newly proposed method is preferable to an existing one, researchers often rely on simulation-based comparisons. Such an evaluation approach, however, has certain limitations and often cannot fully inform us about a method's true usefulness. We therefore conducted different user studies to assess the practical value of query-based ontology debugging. One main insight from the studies is that the considered interactive approach is indeed more efficient than an alternative algorithmic debugging based on test cases. We also observed that users frequently made errors in the process, which highlights the importance of a careful design of the queries that users need to answer.