Explanation & Argumentation
AIhub monthly digest: April 2024 – explainable AI, access to compute, and noughts and crosses
Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we hear about the effect of computing resource on AI research, learn about creating explanations for AI-based decision-making systems, and find out about the moderating effect of instant runoff voting. In a series of interviews, we're chatting to some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. In our latest two interviews, we met Bálint Gyevnár and Mike Lee and asked about their work on different aspects of explainable AI. This month, our experts consider the debate around open vs closed science.
People Attribute Purpose to Autonomous Vehicles When Explaining Their Behavior
Gyevnar, Balint, Droop, Stephanie, Quillien, Tadeg, Cohen, Shay B., Bramley, Neil R., Lucas, Christopher G., Albrecht, Stefano V.
Cognitive science can help us understand which explanations people might expect, and in which format they frame these explanations, whether causal, counterfactual, or teleological (i.e., purpose-oriented). Understanding the relevance of these concepts is crucial for building good explainable AI (XAI) which offers recourse and actionability. Focusing on autonomous driving, a complex decision-making domain, we report empirical data from two surveys on (i) how people explain the behavior of autonomous vehicles in 14 unique scenarios (N1=54), and (ii) how they perceive these explanations in terms of complexity, quality, and trustworthiness (N2=356). Participants deemed teleological explanations significantly better quality than counterfactual ones, with perceived teleology being the best predictor of perceived quality and trustworthiness. Neither the perceived teleology nor the quality were affected by whether the car was an autonomous vehicle or driven by a person. This indicates that people use teleology to evaluate information about not just other people but also autonomous vehicles. Taken together, our findings highlight the importance of explanations that are framed in terms of purpose rather than just, as is standard in XAI, the causal mechanisms involved. We release the 14 scenarios and more than 1,300 elicited explanations publicly as the Human Explanations for Autonomous Driving Decisions (HEADD) dataset.
QUCE: The Minimisation and Quantification of Path-Based Uncertainty for Generative Counterfactual Explanations
Duell, Jamie, Seisenberger, Monika, Fu, Hsuan, Fan, Xiuyi
Given the prevalence of big data and increased computing power, the application of Deep Neural Network (DNN) methods is commonplace. However, the intricacies and depth of DNN architectures lead to results that lack inherent interpretability. In pivotal domains such as healthcare and finance, interpretability is crucial and thus the application of eXplainable Artificial Intelligence (XAI) to extract valuable insights from the DNN models is widespread [1, 2]. The Path-Integrated Gradients (Path-IG) [3] formulation presents axiomatic properties that are upheld solely by path-based explanation methods. The Out-of-Distribution (OoD) problem is prevalent in the application of path-based explanation methods [4]; here the intuition is that traveling along a straight-line path can incur irregular gradients and thus provide noisy attribution values [5]. Another known limitation of many Integrated Gradient (IG) [3] based approaches is the selection of a baseline reference; the Adversarial Gradient Integration (AGI) [6] method relaxes this constraint by generating baselines in adversarial classes. We note that AGI utilizes the path-based approach for generating counterfactual examples, and for this reason it will be a primary baseline for our proposed method throughout this paper.
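For readers unfamiliar with the straight-line path that the passage above refers to, the following is a minimal sketch of standard Integrated Gradients, not of QUCE or AGI. It approximates IG_i(x) = (x_i - x'_i) * integral over alpha in [0, 1] of dF/dx_i evaluated at x' + alpha (x - x'), where x' is the baseline. The toy logistic model, its weights, the input, and the all-zeros baseline are assumptions made purely for illustration.

```python
import math

def model(x):
    """Toy differentiable model: one logistic unit (hypothetical weights)."""
    z = 2.0 * x[0] - 1.0 * x[1] + 0.5
    return 1.0 / (1.0 + math.exp(-z))

def model_grad(x):
    """Analytic gradient of the toy model with respect to its inputs."""
    s = model(x)
    ds = s * (1.0 - s)
    return [2.0 * ds, -1.0 * ds]

def integrated_gradients(x, baseline, steps=1000):
    """Midpoint Riemann-sum approximation of IG along the straight-line path."""
    n = len(x)
    totals = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = model_grad(point)
        for i in range(n):
            totals[i] += grad[i]
    # Average gradient along the path, scaled by the input-baseline difference.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]

x = [1.0, 0.5]           # illustrative input
baseline = [0.0, 0.0]    # illustrative baseline choice
attributions = integrated_gradients(x, baseline)

# Completeness axiom: attributions should sum to F(x) - F(baseline).
completeness_gap = abs(sum(attributions) - (model(x) - model(baseline)))
```

The attributions depend on the chosen baseline, which is the limitation the passage says AGI relaxes. On this smooth toy model the completeness gap is only numerical integration error; the noisy gradients described above arise with deep networks, not with a single logistic unit.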
How explainable AI affects human performance: A systematic review of the behavioural consequences of saliency maps
Saliency maps can explain how deep neural networks classify images. But are they actually useful for humans? The present systematic review of 68 user studies found that while saliency maps can enhance human performance, null effects or even costs are quite common. To investigate what modulates these effects, the empirical outcomes were organised along several factors related to the human tasks, AI performance, XAI methods, images to be classified, human participants and comparison conditions. In image-focused tasks, benefits were less common than in AI-focused tasks, but the effects depended on the specific cognitive requirements. Moreover, benefits were usually restricted to incorrect AI predictions in AI-focused tasks but to correct ones in image-focused tasks. XAI-related factors had surprisingly little impact. The evidence was limited for image- and human-related factors and the effects were highly dependent on the comparison conditions. These findings may support the design of future user studies.
Fiper: a Visual-based Explanation Combining Rules and Feature Importance
Cappuccio, Eleonora, Fadda, Daniele, Lanzilotti, Rosa, Rinzivillo, Salvatore
Artificial Intelligence algorithms have now become pervasive in multiple high-stakes domains. However, their internal logic can be obscure to humans. Explainable Artificial Intelligence aims to design tools and techniques to illustrate the predictions of the so-called black-box algorithms. The Human-Computer Interaction community has long stressed the need for a more user-centered approach to Explainable AI. This approach can benefit from research in user interface, user experience, and visual analytics. This paper proposes a visual-based method to illustrate rules paired with feature importance. A user study with 15 participants was conducted comparing our visual method with the original output of the algorithm and textual representation to test its effectiveness with users.
Learning Actionable Counterfactual Explanations in Large State Spaces
Naggita, Keziah, Walter, Matthew R., Blum, Avrim
Counterfactual explanations (CFEs) are sets of actions that an agent with a negative classification could take to achieve a (desired) positive classification, for consequential decisions such as loan applications, hiring, admissions, etc. In this work, we consider settings where optimal CFEs correspond to solutions of weighted set cover problems. In particular, there is a collection of actions that agents can perform that each have their own cost and each provide the agent with different sets of capabilities. The agent wants to perform the cheapest subset of actions that together provide all the needed capabilities to achieve a positive classification. Since this is an NP-hard optimization problem, we are interested in the question: can we, from training data (instances of agents and their optimal CFEs) learn a CFE generator that will quickly provide optimal sets of actions for new agents? In this work, we provide a deep-network learning procedure that we show experimentally is able to achieve strong performance at this task. We consider several problem formulations, including formulations in which the underlying "capabilities" and effects of actions are not explicitly provided, and so there is an informational challenge in addition to the computational challenge. Our problem can also be viewed as one of learning an optimal policy in a family of large but deterministic Markov Decision Processes (MDPs).
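The weighted set cover view of CFEs described above can be made concrete with a tiny exhaustive solver. This is only a sketch of the underlying optimization problem, not the learned generator the abstract proposes; the action names, costs, and capabilities are hypothetical. Because the search is exponential in the number of actions, it also shows why a fast learned generator is attractive for large instances.

```python
from itertools import combinations

def cheapest_cover(actions, needed):
    """Exhaustively find the minimum-cost set of actions covering `needed`.

    actions: dict mapping action name -> (cost, set of capabilities provided).
    Returns (total_cost, list of action names), or None if no cover exists.
    Exponential in the number of actions, so only suitable for tiny instances.
    """
    names = sorted(actions)
    best = None
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            covered = set()
            for name in subset:
                covered |= actions[name][1]
            if needed <= covered:  # all needed capabilities are provided
                cost = sum(actions[name][0] for name in subset)
                if best is None or cost < best[0]:
                    best = (cost, list(subset))
    return best

# Hypothetical loan-application instance; names and costs are illustrative only.
actions = {
    "pay_down_debt": (4.0, {"low_debt_ratio"}),
    "add_cosigner": (3.0, {"low_debt_ratio", "credit_history"}),
    "open_savings": (1.0, {"savings_buffer"}),
    "take_course": (5.0, {"credit_history", "savings_buffer"}),
}
needed = {"low_debt_ratio", "credit_history", "savings_buffer"}
result = cheapest_cover(actions, needed)
# result == (4.0, ["add_cosigner", "open_savings"])
```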
SIDEs: Separating Idealization from Deceptive Explanations in xAI
Explainable AI (xAI) methods are important for establishing trust in using black-box models. However, criticism has recently mounted that current xAI methods disagree, are necessarily false, and can be manipulated, which has started to undermine the deployment of black-box models. Rudin (2019) goes so far as to say that we should stop using black-box models altogether in high-stakes cases because xAI explanations "must be wrong". However, strict fidelity to the truth is historically not a desideratum in science. Idealizations -- the intentional distortions introduced to scientific theories and models -- are commonplace in the natural sciences and are seen as a successful scientific tool. Thus, it is not falsehood qua falsehood that is the issue. In this paper, I outline the need for xAI research to engage in idealization evaluation. Drawing on the use of idealizations in the natural sciences and philosophy of science, I introduce a novel framework for evaluating whether xAI methods engage in successful idealizations or deceptive explanations (SIDEs). SIDEs evaluates whether the limitations of xAI methods, and the distortions that they introduce, can be part of a successful idealization or are indeed deceptive distortions as critics suggest. I discuss the role that existing research can play in idealization evaluation and where innovation is necessary. Through a qualitative analysis, I find that leading feature importance methods and counterfactual explanations are subject to idealization failure, and I suggest remedies for ameliorating idealization failure.
Explainable AI models for predicting liquefaction-induced lateral spreading
Hsiao, Cheng-Hsi, Kumar, Krishna, Rathje, Ellen
Earthquake-induced liquefaction can cause substantial lateral spreading, posing threats to infrastructure. Machine learning (ML) can improve lateral spreading prediction models by capturing complex soil characteristics and site conditions. However, the "black box" nature of ML models can hinder their adoption in critical decision-making. This study addresses this limitation by using SHapley Additive exPlanations (SHAP) to interpret an eXtreme Gradient Boosting (XGB) model for lateral spreading prediction, trained on data from the 2011 Christchurch Earthquake. SHAP analysis reveals the factors driving the model's predictions, enhancing transparency and allowing for comparison with established engineering knowledge. The results demonstrate that the XGB model successfully identifies the importance of soil characteristics derived from Cone Penetration Test (CPT) data in predicting lateral spreading, validating its alignment with domain understanding. This work highlights the value of explainable machine learning for reliable and informed decision-making in geotechnical engineering and hazard assessment.
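To illustrate the quantity that SHAP attributes to each feature, here is a minimal sketch that computes exact Shapley values by enumerating every feature coalition. It is not the tree-specific algorithm that the SHAP library applies to gradient-boosted models, and it uses no data from the study: the three-feature toy predictor, the input, and the all-zeros baseline are assumptions standing in for a trained model and real site measurements.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction by enumerating all coalitions.

    Features outside a coalition are replaced by their baseline value.
    Cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_predict(features):
    """Hypothetical stand-in for a trained model, with one interaction term."""
    a, b, c = features
    return 3.0 * a + 2.0 * b * c

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_predict, x, baseline)
```

In this toy case the additive feature receives its full effect (about 3.0) and the two interacting features split their joint effect equally (about 6.0 each), so the values sum to the difference between the prediction at x and at the baseline.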
MiMICRI: Towards Domain-centered Counterfactual Explanations of Cardiovascular Image Classification Models
Guo, Grace, Deng, Lifu, Tandon, Animesh, Endert, Alex, Kwon, Bum Chul
The recent prevalence of publicly accessible, large medical imaging datasets has led to a proliferation of artificial intelligence (AI) models for cardiovascular image classification and analysis. At the same time, the potentially significant impacts of these models have motivated the development of a range of explainable AI (XAI) methods that aim to explain model predictions given certain image inputs. However, many of these methods are not developed or evaluated with domain experts, and explanations are not contextualized in terms of medical expertise or domain knowledge. In this paper, we propose a novel framework and python library, MiMICRI, that provides domain-centered counterfactual explanations of cardiovascular image classification models. MiMICRI helps users interactively select and replace segments of medical images that correspond to morphological structures. From the counterfactuals generated, users can then assess the influence of each segment on model predictions, and validate the model against known medical facts. We evaluate this library with two medical experts. Our evaluation demonstrates that a domain-centered XAI approach can enhance the interpretability of model explanations, and help experts reason about models in terms of relevant domain knowledge. However, concerns were also surfaced about the clinical plausibility of the counterfactuals generated. We conclude with a discussion on the generalizability and trustworthiness of the MiMICRI framework, as well as the implications of our findings on the development of domain-centered XAI methods for model interpretability in healthcare contexts.
Debiasing Machine Unlearning with Counterfactual Examples
Chen, Ziheng, Wang, Jia, Zhuang, Jun, Reddy, Abbavaram Gowtham, Silvestri, Fabrizio, Huang, Jin, Nag, Kaushiki, Kuang, Kun, Ning, Xin, Tolomei, Gabriele
The right to be forgotten (RTBF) seeks to safeguard individuals from the enduring effects of their historical actions by implementing machine unlearning techniques. These techniques facilitate the deletion of previously acquired knowledge without requiring extensive model retraining. However, they often overlook a critical issue: bias in the unlearning process. This bias emerges from two main sources: (1) data-level bias, characterized by uneven data removal, and (2) algorithm-level bias, which leads to the contamination of the remaining dataset, thereby degrading model accuracy. In this work, we analyze the causal factors behind the unlearning process and mitigate biases at both data and algorithmic levels. Specifically, we introduce an intervention-based approach, where knowledge to forget is erased with a debiased dataset. In addition, we guide the forgetting procedure by leveraging counterfactual examples, as they maintain semantic data consistency without hurting performance on the remaining dataset. Experimental results demonstrate that our method outperforms existing machine unlearning baselines on evaluation metrics.