Goto

Collaborating Authors

 Explanation & Argumentation


UGCE: User-Guided Incremental Counterfactual Exploration

arXiv.org Artificial Intelligence

-- Counterfactual explanations (CFEs) are a popular approach for interpreting machine learning predictions by identifying minimal feature changes that alter model outputs. However, in real-world settings, users often refine feasibility constraints over time, requiring counterfactual generation to adapt dynamically. Existing methods fail to support such iterative updates, instead recomputing explanations from scratch with each change, an inefficient and rigid approach. We propose User-Guided Incremental Counterfactual Exploration (UGCE), a genetic algorithm-based framework that incrementally updates counterfactuals in response to evolving user constraints. Experimental results across five benchmark datasets demonstrate that UGCE significantly improves computational efficiency while maintaining high-quality solutions compared to a static, non-incremental approach. Our evaluation further shows that UGCE supports stable performance under varying constraint sequences, benefits from an efficient warm-start strategy, and reveals how different constraint types may affect search behavior . I NTRODUCTION Machine learning (ML) models are increasingly deployed in high-stakes decision-making domains, including lending, college admissions, and hiring, where their predictions influence critical life outcomes [1]-[3]. However, these models often function as black boxes, making it difficult for stakeholders to understand the rationale behind predictions, particularly when an unfavorable decision is made. This lack of transparency has driven the need for explanation techniques that help users interpret and contest automated decisions [4], [5]. As a result, research on explainability has gained significant traction, leading to a wide array of methodologies aimed at making ML models more transparent and interpretable [5]-[13].


Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory

Neural Information Processing Systems

Counterfactual explanations provide ways of achieving a favorable model outcome with minimum input perturbation. However, counterfactual explanations can also be leveraged to reconstruct the model by strategically training a surrogate model to give similar predictions as the original (target) model. In this work, we analyze how model reconstruction using counterfactuals can be improved byfurther leveraging the fact that the counterfactuals also lie quite close to the decision boundary. Our main contribution is to derive novel theoretical relationships between the error in model reconstruction and the number of counterfactual queries required using polytope theory. Our theoretical analysis leads us to propose a strategy for model reconstruction that we call Counterfactual Clamping Attack (CCA) which trains a surrogate model using a unique loss function that treats counterfactuals differently than ordinary instances.


Navigating the Maze of Explainable AI: A Systematic Approach to Evaluating Methods and Metrics

Neural Information Processing Systems

Explainable AI (XAI) is a rapidly growing domain with a myriad of proposed methods as well as metrics aiming to evaluate their efficacy. However, current studies are often of limited scope, examining only a handful of XAI methods and ignoring underlying design parameters for performance, such as the model architecture or the nature of input data. Moreover, they often rely on one or a few metrics and neglect thorough validation, increasing the risk of selection bias and ignoring discrepancies among metrics. These shortcomings leave practitioners confused about which method to choose for their problem. In response, we introduce LATEC, a large-scale benchmark that critically evaluates 17 prominent XAI methods using 20 distinct metrics.


Explanation User Interfaces: A Systematic Literature Review

arXiv.org Artificial Intelligence

Artificial Intelligence (AI) is one of the major technological advancements of this century, bearing incredible potential for users through AI-powered applications and tools in numerous domains. Being often black-box (i.e., its decision-making process is unintelligible), developers typically resort to eXplainable Artificial Intelligence (XAI) techniques to interpret the behaviour of AI models to produce systems that are transparent, fair, reliable, and trustworthy. However, presenting explanations to the user is not trivial and is often left as a secondary aspect of the system's design process, leading to AI systems that are not useful to end-users. This paper presents a Systematic Literature Review on Explanation User Interfaces (XUIs) to gain a deeper understanding of the solutions and design guidelines employed in the academic literature to effectively present explanations to users. To improve the contribution and real-world impact of this survey, we also present a framework for Human-cEnteRed developMent of Explainable user interfaceS (HERMES) to guide practitioners and academics in the design and evaluation of XUIs.


A New Approach to Backtracking Counterfactual Explanations: A Unified Causal Framework for Efficient Model Interpretability

arXiv.org Machine Learning

Counterfactual explanations enhance interpretability by identifying alternative inputs that produce different outputs, offering localized insights into model decisions. However, traditional methods often neglect causal relationships, leading to unrealistic examples. While newer approaches integrate causality, they are computationally expensive. To address these challenges, we propose an efficient method called BRACE based on backtracking counterfactuals that incorporates causal reasoning to generate actionable explanations. We first examine the limitations of existing methods and then introduce our novel approach and its features. We also explore the relationship between our method and previous techniques, demonstrating that it generalizes them in specific scenarios. Finally, experiments show that our method provides deeper insights into model outputs.


Relevance for Stability of Verification Status of a Set of Arguments in Incomplete Argumentation Frameworks (with Proofs)

arXiv.org Artificial Intelligence

The notion of relevance was proposed for stability of justification status of a single argument in incomplete argumentation frameworks (IAFs) in 2024 by Odekerken et al. To extend the notion, we study the relevance for stability of verification status of a set of arguments in this paper, i.e., the uncertainties in an IAF that have to be resolved in some situations so that answering whether a given set of arguments is an extension obtains the same result in every completion of the IAF. Further we propose the notion of strong relevance for describing the necessity of resolution in all situations reaching stability. An analysis of complexity reveals that detecting the (strong) relevance for stability of sets of arguments can be accomplished in P time under the most semantics discussed in the paper. We also discuss the difficulty in finding tractable methods for relevance detection under grounded semantics.


Aligning Explanations with Human Communication

arXiv.org Machine Learning

Machine learning explainability aims to make the decision-making process of black-box models more transparent by finding the most important input features for a given prediction task. Recent works have proposed composing explanations from semantic concepts (e.g., colors, patterns, shapes) that are inherently interpretable to the user of a model. However, these methods generally ignore the communicative context of explanation-the ability of the user to understand the prediction of the model from the explanation. For example, while a medical doctor might understand an explanation in terms of clinical markers, a patient may need a more accessible explanation to make sense of the same diagnosis. In this paper, we address this gap with listener-adaptive explanations. We propose an iterative procedure grounded in principles of pragmatic reasoning and the rational speech act to generate explanations that maximize communicative utility. Our procedure only needs access to pairwise preferences between candidate explanations, relevant in real-world scenarios where a listener model may not be available. We evaluate our method in image classification tasks, demonstrating improved alignment between explanations and listener preferences across three datasets. Furthermore, we perform a user study that demonstrates our explanations increase communicative utility.


Explainable AI for Securing Healthcare in IoT-Integrated 6G Wireless Networks

arXiv.org Artificial Intelligence

As healthcare systems increasingly rely on advanced wireless networks and connected devices, ensuring the security of medical applications has become a critical concern. The integration of Internet of Medical Things (IoMT) devices with real - time health monitoring and care delivery has revolutionized patient care but has also introduced new security vulnerabilities. Each connected device, whether it is a part of a robotic surgical arm, intensive care equipment, or a wearable health monitor, serves as a poten tial entry point for cyberattacks. Such vulnerabilities could lead to life threatening consequences like poorly performed surgeries, malfunctioning of life support systems or incorrect treatment due to data breache s . The ITU IMT - 2030 framework envisions that 6G will be transforming healthcare through massive connectivity, AI, and cloud integration. However, it may also introduce new security vulnerabilities that can threaten the patient safety and privacy. Therefore, a ddressing these threats requires a thor ough reassessment of security measures . This paper presents an innovative use of explainable AI (XAI) techniques - such as SHAP, LIME, and DiCE - to identify vulnerabilities, strengthen security measures, and enhance both security and transparency within the 6G healthcare ecosystem, ensuring robust protection and trust . In addition to the theoretical background, this paper presents experimental analysis and the authors very positive findings.


Unveil Sources of Uncertainty: Feature Contribution to Conformal Prediction Intervals

arXiv.org Machine Learning

Cooperative game theory methods, notably Shapley values, have significantly enhanced machine learning (ML) interpretability. However, existing explainable AI (XAI) frameworks mainly attribute average model predictions, overlooking predictive uncertainty. This work addresses that gap by proposing a novel, model-agnostic uncertainty attribution (UA) method grounded in conformal prediction (CP). By defining cooperative games where CP interval properties-such as width and bounds-serve as value functions, we systematically attribute predictive uncertainty to input features. Extending beyond the traditional Shapley values, we use the richer class of Harsanyi allocations, and in particular the proportional Shapley values, which distribute attribution proportionally to feature importance. We propose a Monte Carlo approximation method with robust statistical guarantees to address computational feasibility, significantly improving runtime efficiency. Our comprehensive experiments on synthetic benchmarks and real-world datasets demonstrate the practical utility and interpretative depth of our approach. By combining cooperative game theory and conformal prediction, we offer a rigorous, flexible toolkit for understanding and communicating predictive uncertainty in high-stakes ML applications.


From What Ifs to Insights: Counterfactuals in Causal Inference vs. Explainable AI

arXiv.org Machine Learning

Counterfactuals play a pivotal role in the two distinct data science fields of causal inference (CI) and explainable artificial intelligence (XAI). While the core idea behind counterfactuals remains the same in both fields--the examination of what would have happened under different circumstances--there are key differences in how they are used and interpreted. We introduce a formal definition that encompasses the multi-faceted concept of the counterfactual in CI and XAI. We then discuss how counterfactuals are used, evaluated, generated, and operationalized in CI vs. XAI, highlighting conceptual and practical differences. By comparing and contrasting the two, we hope to identify opportunities for cross-fertilization across CI and XAI.