Goto

Collaborating Authors

 Explanation & Argumentation


Explainable AI guided unsupervised fault diagnostics for high-voltage circuit breakers

arXiv.org Artificial Intelligence

Commercial high-voltage circuit breaker (CB) condition monitoring systems rely on directly observable physical parameters such as gas filling pressure with pre-defined thresholds. While these parameters are crucial, they only cover a small subset of malfunctioning mechanisms and usually can be monitored only if the CB is disconnected from the grid. To facilitate online condition monitoring while CBs remain connected, non-intrusive measurement techniques such as vibration or acoustic signals are necessary. Currently, CB condition monitoring studies using these signals typically utilize supervised methods for fault diagnostics, where ground-truth fault types are known due to artificially introduced faults in laboratory settings. This supervised approach is however not feasible in real-world applications, where fault labels are unavailable. In this work, we propose a novel unsupervised fault detection and segmentation framework for CBs based on vibration and acoustic signals. This framework can detect deviations from the healthy state. The explainable artificial intelligence (XAI) approach is applied to the detected faults for fault diagnostics. The specific contributions are: (1) we propose an integrated unsupervised fault detection and segmentation framework that is capable of detecting faults and clustering different faults with only healthy data required during training (2) we provide an unsupervised explainability-guided fault diagnostics approach using XAI to offer domain experts potential indications of the aged or faulty components, achieving fault diagnostics without the prerequisite of ground-truth fault labels. These contributions are validated using an experimental dataset from a high-voltage CB under healthy and artificially introduced fault conditions, contributing to more reliable CB system operation.


Why Do Class-Dependent Evaluation Effects Occur with Time Series Feature Attributions? A Synthetic Data Investigation

arXiv.org Artificial Intelligence

Evaluating feature attribution methods represents a critical challenge in explainable AI (XAI), as researchers typically rely on perturbation-based metrics when ground truth is unavailable. However, recent work reveals that these evaluation metrics can show different performance across predicted classes within the same dataset. These "class-dependent evaluation effects" raise questions about whether perturbation analysis reliably measures attribution quality, with direct implications for XAI method development and evaluation trustworthiness. We investigate under which conditions these class-dependent effects arise by conducting controlled experiments with synthetic time series data where ground truth feature locations are known. We systematically vary feature types and class contrasts across binary classification tasks, then compare perturbation-based degradation scores with ground truth-based precision-recall metrics using multiple attribution methods. Our experiments demonstrate that class-dependent effects emerge with both evaluation approaches, even in simple scenarios with temporally localized features, triggered by basic variations in feature amplitude or temporal extent between classes. Most critically, we find that perturbation-based and ground truth metrics frequently yield contradictory assessments of attribution quality across classes, with weak correlations between evaluation approaches. These findings suggest that researchers should interpret perturbation-based metrics with care, as they may not always align with whether attributions correctly identify discriminating features. By showing this disconnect, our work points toward reconsidering what attribution evaluation actually measures and developing more rigorous evaluation methods that capture multiple dimensions of attribution quality.


Tabular Diffusion based Actionable Counterfactual Explanations for Network Intrusion Detection

arXiv.org Artificial Intelligence

Modern network intrusion detection systems (NIDS) frequently utilize the predictive power of complex deep learning models. However, the "black-box" nature of such deep learning methods adds a layer of opaqueness that hinders the proper understanding of detection decisions, trust in the decisions and prevent timely countermeasures against such attacks. Explainable AI (XAI) methods provide a solution to this problem by providing insights into the causes of the predictions. The majority of the existing XAI methods provide explanations which are not convenient to convert into actionable countermeasures. In this work, we propose a novel diffusion-based counterfactual explanation framework that can provide actionable explanations for network intrusion attacks. We evaluated our proposed algorithm against several other publicly available counterfactual explanation algorithms on 3 modern network intrusion datasets. To the best of our knowledge, this work also presents the first comparative analysis of existing counterfactual explanation algorithms within the context of network intrusion detection systems. Our proposed method provide minimal, diverse counterfactual explanations out of the tested counterfactual explanation algorithms in a more efficient manner by reducing the time to generate explanations. We also demonstrate how counterfactual explanations can provide actionable explanations by summarizing them to create a set of global rules. These rules are actionable not only at instance level but also at the global level for intrusion attacks. These global counterfactual rules show the ability to effectively filter out incoming attack queries which is crucial for efficient intrusion detection and defense mechanisms.


Antithetic Sampling for Top-k Shapley Identification

arXiv.org Artificial Intelligence

Additive feature explanations rely primarily on game-theoretic notions such as the Shapley value by viewing features as cooperating players. The Shapley value's popularity in and outside of explainable AI stems from its axiomatic uniqueness. However, its computational complexity severely limits practicability. Most works investigate the uniform approximation of all features' Shapley values, needlessly consuming samples for insignificant features. In contrast, identifying the $k$ most important features can already be sufficiently insightful and yields the potential to leverage algorithmic opportunities connected to the field of multi-armed bandits. We propose Comparable Marginal Contributions Sampling (CMCS), a method for the top-$k$ identification problem utilizing a new sampling scheme taking advantage of correlated observations. We conduct experiments to showcase the efficacy of our method in compared to competitive baselines. Our empirical findings reveal that estimation quality for the approximate-all problem does not necessarily transfer to top-$k$ identification and vice versa.


DiCE-Extended: A Robust Approach to Counterfactual Explanations in Machine Learning

arXiv.org Artificial Intelligence

Explainable artificial intelligence (XAI) has become increasingly important in decision-critical domains such as healthcare, finance, and law. Counterfactual (CF) explanations, a key approach in XAI, provide users with actionable insights by suggesting minimal modifications to input features that lead to different model outcomes. Despite significant advancements, existing CF generation methods often struggle to balance proximity, diversity, and robustness, limiting their real-world applicability. A widely adopted framework, Diverse Counterfactual Explanations (DiCE), emphasizes diversity but lacks robustness, making CF explanations sensitive to perturbations and domain constraints. To address these challenges, we introduce DiCE-Extended, an enhanced CF explanation framework that integrates multi-objective optimization techniques to improve robustness while maintaining interpretability. Our approach introduces a novel robustness metric based on the Dice-Sørensen coefficient, enabling stability under small input variations. Additionally, we refine CF generation using weighted loss components (lambda_p, lambda_d, lambda_r) to balance proximity, diversity, and robustness. We empirically validate DiCE-Extended on benchmark datasets (COMPAS, Lending Club, German Credit, Adult Income) across multiple ML backends (Scikit-learn, PyTorch, TensorFlow). Results demonstrate improved CF validity, stability, and alignment with decision boundaries compared to standard DiCE-generated explanations. Our findings highlight the potential of DiCE-Extended in generating more reliable and interpretable CFs for high-stakes applications. Future work could explore adaptive optimization techniques and domain-specific constraints to further enhance CF generation in real-world scenarios


Exploiting Constraint Reasoning to Build Graphical Explanations for Mixed-Integer Linear Programming

arXiv.org Artificial Intelligence

Following the recent push for trustworthy AI, there has been an increasing interest in developing contrastive explanation techniques for optimisation, especially concerning the solution of specific decision-making processes formalised as MILPs. Along these lines, we propose X-MILP, a domain-agnostic approach for building contrastive explanations for MILPs based on constraint reasoning techniques. First, we show how to encode the queries a user makes about the solution of an MILP problem as additional constraints. Then, we determine the reasons that constitute the answer to the user's query by computing the Irreducible Infeasible Subsystem (IIS) of the newly obtained set of constraints. Finally, we represent our explanation as a "graph of reasons" constructed from the IIS, which helps the user understand the structure among the reasons that answer their query. We test our method on instances of well-known optimisation problems to evaluate the empirical hardness of computing explanations.


A Survey of Explainable Reinforcement Learning: Targets, Methods and Needs

arXiv.org Artificial Intelligence

The success of recent Artificial Intelligence (AI) models has been accompanied by the opacity of their internal mechanisms, due notably to the use of deep neural networks. In order to understand these internal mechanisms and explain the output of these AI models, a set of methods have been proposed, grouped under the domain of eXplainable AI (XAI). This paper focuses on a sub-domain of XAI, called eXplainable Reinforcement Learning (XRL), which aims to explain the actions of an agent that has learned by reinforcement learning. We propose an intuitive taxonomy based on two questions "What" and "How". The first question focuses on the target that the method explains, while the second relates to the way the explanation is provided. We use this taxonomy to provide a state-of-the-art review of over 250 papers. In addition, we present a set of domains close to XRL, which we believe should get attention from the community. Finally, we identify some needs for the field of XRL.


MUPAX: Multidimensional Problem Agnostic eXplainable AI

arXiv.org Artificial Intelligence

Robust XAI techniques should ideally be simultaneously deterministic, model agnostic, and guaranteed to converge. We propose MULTIDIMENSIONAL PROBLEM AGNOSTIC EXPLAINABLE AI (MUPAX), a deterministic, model agnostic explainability technique, with guaranteed convergency. MUPAX measure theoretic formulation gives principled feature importance attribution through structured perturbation analysis that discovers inherent input patterns and eliminates spurious relationships. We evaluate MUPAX on an extensive range of data modalities and tasks: audio classification (1D), image classification (2D), volumetric medical image analysis (3D), and anatomical landmark detection, demonstrating dimension agnostic effectiveness. The rigorous convergence guarantees extend to any loss function and arbitrary dimensions, making MUPAX applicable to virtually any problem context for AI. By contrast with other XAI methods that typically decrease performance when masking, MUPAX not only preserves but actually enhances model accuracy by capturing only the most important patterns of the original data. Extensive benchmarking against the state of the XAI art demonstrates MUPAX ability to generate precise, consistent and understandable explanations, a crucial step towards explainable and trustworthy AI systems. The source code will be released upon publication.


Robust Explanations Through Uncertainty Decomposition: A Path to Trustworthier AI

arXiv.org Artificial Intelligence

Recent advancements in machine learning have emphasized the need for transparency in model predictions, particularly as interpretability diminishes when using increasingly complex architectures. In this paper, we propose leveraging prediction uncertainty as a complementary approach to classical explainability methods. Specifically, we distinguish between aleatoric (data-related) and epistemic (model-related) uncertainty to guide the selection of appropriate explanations. Epistemic uncertainty serves as a rejection criterion for unreliable explanations and, in itself, provides insight into insufficient training (a new form of explanation). Aleatoric uncertainty informs the choice between feature-importance explanations and counterfactual explanations. This leverages a framework of explainability methods driven by uncertainty quantification and disentanglement. Our experiments demonstrate the impact of this uncertainty-aware approach on the robustness and attainability of explanations in both traditional machine learning and deep learning scenarios.


AKReF: An argumentative knowledge representation framework for structured argumentation

arXiv.org Artificial Intelligence

This paper presents a framework to convert argumentative texts into argument knowledge graphs (AKG). The proposed argumentative knowledge representation framework (AKReF) extends the theoretical foundation and enables the AKG to provide a graphical view of the argumentative structure that is easier to understand. Starting with basic annotations of argumentative components (ACs) and argumentative relations (ARs), we enrich the information by constructing a knowledge base (KB) graph with metadata attributes for nodes. Next, we apply modus ponens on premises and inference rules from the KB to form arguments. From these arguments, we create an AKG. The nodes and edges of the AKG have attributes capturing key argumentative features such as the type of premise (e.g., axiom, ordinary premise, assumption), the type of inference rule (e.g., strict, defeasible), preference order over defeasible rules, markers (e.g., "therefore", "however"), and the type of attack (e.g., undercut, rebuttal, undermining). We identify inference rules by locating a specific set of markers, called inference markers (IM). This, in turn, makes it possible to identify undercut attacks previously undetectable in existing datasets. AKG prepares the ground for reasoning tasks, including checking the coherence of arguments and identifying opportunities for revision. For this, it is essential to find indirect relations, many of which are implicit. Our proposed AKG format, with annotated inference rules and modus ponens, helps reasoning models learn the implicit, indirect relations that require inference over arguments and their interconnections. We use an essay from the AAEC dataset to illustrate the framework. We further show its application in complex analyses such as extracting a conflict-free set and a maximal set of admissible arguments.