Explanation & Argumentation
FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods
Hesse, Robin, Schaub-Meyer, Simone, Roth, Stefan
The field of explainable artificial intelligence (XAI) aims to uncover the inner workings of complex deep neural models. While being crucial for safety-critical domains, XAI inherently lacks ground-truth explanations, making its automatic evaluation an unsolved problem. We address this challenge by proposing a novel synthetic vision dataset, named FunnyBirds, and accompanying automatic evaluation protocols. Our dataset allows performing semantically meaningful image interventions, e.g., removing individual object parts, which has three important implications. First, it enables analyzing explanations on a part level, which is closer to human comprehension than existing methods that evaluate on a pixel level. Second, by comparing the model output for inputs with removed parts, we can estimate ground-truth part importances that should be reflected in the explanations. Third, by mapping individual explanations into a common space of part importances, we can analyze a variety of different explanation types in a single common framework. Using our tools, we report results for 24 different combinations of neural models and XAI methods, demonstrating the strengths and weaknesses of the assessed methods in a fully automatic and systematic manner.
Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation
Vaeth, Philipp, Fruehwald, Alexander M., Paassen, Benjamin, Gregorova, Magda
Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality. However, it is currently difficult to compare the performance of these VCE methods as the evaluation procedures largely vary and often boil down to visual inspection of individual examples and small scale user studies. In this work, we propose a framework for systematic, quantitative evaluation of the VCE methods and a minimal set of metrics to be used. We use this framework to explore the effects of certain crucial design choices in the latest diffusion-based generative models for VCEs of natural image classification (ImageNet). We conduct a battery of ablation-like experiments, generating thousands of VCEs for a suite of classifiers of various complexity, accuracy and robustness. Our findings suggest multiple directions for future advancements and improvements of VCE methods. By sharing our methodology and our approach to tackle the computational challenges of such a study on a limited hardware setup (including the complete code base), we offer a valuable guidance for researchers in the field fostering consistency and transparency in the assessment of counterfactual explanations.
An Autoethnographic Exploration of XAI in Algorithmic Composition
Noel-Hirst, Ashley, Bryan-Kinns, Nick
Machine Learning models are capable of generating complex music across a range of genres from folk to classical music. However, current generative music AI models are typically difficult to understand and control in meaningful ways. Whilst research has started to explore how explainable AI (XAI) generative models might be created for music, no generative XAI models have been studied in music making practice. This paper introduces an autoethnographic study of the use of the MeasureVAE generative music XAI model with interpretable latent dimensions trained on Irish folk music. Findings suggest that the exploratory nature of the music-making workflow foregrounds musical features of the training dataset rather than features of the generative model itself. The appropriation of an XAI model within an iterative workflow highlights the potential of XAI models to form part of a richer and more complex workflow than they were initially designed for.
Exploring XAI for the Arts: Explaining Latent Space in Generative Music
Bryan-Kinns, Nick, Banar, Berker, Ford, Corey, Reed, Courtney N., Zhang, Yixiao, Colton, Simon, Armitage, Jack
Explainable AI has the potential to support more interactive and fluid co-creative AI systems which can creatively collaborate with people. To do this, creative AI models need to be amenable to debugging by offering eXplainable AI (XAI) features which are inspectable, understandable, and modifiable. However, currently there is very little XAI for the arts. In this work, we demonstrate how a latent variable model for music generation can be made more explainable; specifically we extend MeasureVAE which generates measures of music. We increase the explainability of the model by: i) using latent space regularisation to force some specific dimensions of the latent space to map to meaningful musical attributes, ii) providing a user interface feedback loop to allow people to adjust dimensions of the latent space and observe the results of these changes in real-time, iii) providing a visualisation of the musical attributes in the latent space to help people understand and predict the effect of changes to latent space dimensions. We suggest that in doing so we bridge the gap between the latent space and the generated musical outcomes in a meaningful way which makes the model and its outputs more explainable and more debuggable. The code repository can be found at: https://github.com/bbanar2/
Explainable AI applications in the Medical Domain: a systematic review
Prentzas, Nicoletta, Kakas, Antonis, Pattichis, Constantinos S.
Artificial Intelligence in Medicine has made significant progress with emerging applications in medical imaging, patient care, and other areas. While these applications have proven successful in retrospective studies, very few of them were applied in practice.The field of Medical AI faces various challenges, in terms of building user trust, complying with regulations, using data ethically.Explainable AI (XAI) aims to enable humans understand AI and trust its results. This paper presents a literature review on the recent developments of XAI solutions for medical decision support, based on a representative sample of 198 articles published in recent years. The systematic synthesis of the relevant articles resulted in several findings. (1) model-agnostic XAI techniques were mostly employed in these solutions, (2) deep learning models are utilized more than other types of machine learning models, (3) explainability was applied to promote trust, but very few works reported the physicians participation in the loop, (4) visual and interactive user interface is more useful in understanding the explanation and the recommendation of the system. More research is needed in collaboration between medical and AI experts, that could guide the development of suitable frameworks for the design, implementation, and evaluation of XAI solutions in medicine.
Understanding the Effect of Counterfactual Explanations on Trust and Reliance on AI for Human-AI Collaborative Clinical Decision Making
Artificial intelligence (AI) is increasingly being considered to assist human decision-making in high-stake domains (e.g. health). However, researchers have discussed an issue that humans can over-rely on wrong suggestions of the AI model instead of achieving human AI complementary performance. In this work, we utilized salient feature explanations along with what-if, counterfactual explanations to make humans review AI suggestions more analytically to reduce overreliance on AI and explored the effect of these explanations on trust and reliance on AI during clinical decision-making. We conducted an experiment with seven therapists and ten laypersons on the task of assessing post-stroke survivors' quality of motion, and analyzed their performance, agreement level on the task, and reliance on AI without and with two types of AI explanations. Our results showed that the AI model with both salient features and counterfactual explanations assisted therapists and laypersons to improve their performance and agreement level on the task when `right' AI outputs are presented. While both therapists and laypersons over-relied on `wrong' AI outputs, counterfactual explanations assisted both therapists and laypersons to reduce their over-reliance on `wrong' AI outputs by 21\% compared to salient feature explanations. Specifically, laypersons had higher performance degrades by 18.0 f1-score with salient feature explanations and 14.0 f1-score with counterfactual explanations than therapists with performance degrades of 8.6 and 2.8 f1-scores respectively. Our work discusses the potential of counterfactual explanations to better estimate the accuracy of an AI model and reduce over-reliance on `wrong' AI outputs and implications for improving human-AI collaborative decision-making.
Some Options for Instantiation of Bipolar Argument Graphs with Deductive Arguments
Argument graphs provide an abstract representation of an argumentative situation. A bipolar argument graph is a directed graph where each node denotes an argument, and each arc denotes the influence of one argument on another. Here we assume that the influence is supporting, attacking, or ambiguous. In a bipolar argument graph, each argument is atomic and so it has no internal structure. Yet to better understand the nature of the individual arguments, and how they interact, it is important to consider their internal structure. To address this need, this paper presents a framework based on the use of logical arguments to instantiate bipolar argument graphs, and a set of possible constraints on instantiating arguments that take into account the internal structure of the arguments, and the types of relationship between arguments.
Inherently Interpretable Multi-Label Classification Using Class-Specific Counterfactuals
Sun, Susu, Woerner, Stefano, Maier, Andreas, Koch, Lisa M., Baumgartner, Christian F.
Interpretability is essential for machine learning algorithms in high-stakes application fields such as medical image analysis. However, high-performing black-box neural networks do not provide explanations for their predictions, which can lead to mistrust and suboptimal human-ML collaboration. Post-hoc explanation techniques, which are widely used in practice, have been shown to suffer from severe conceptual problems. Furthermore, as we show in this paper, current explanation techniques do not perform adequately in the multi-label scenario, in which multiple medical findings may co-occur in a single image. We propose Attri-Net, an inherently interpretable model for multi-label classification. Attri-Net is a powerful classifier that provides transparent, trustworthy, and human-understandable explanations. The model first generates class-specific attribution maps based on counterfactuals to identify which image regions correspond to certain medical findings. Then a simple logistic regression classifier is used to make predictions based solely on these attribution maps. We compare Attri-Net to five post-hoc explanation techniques and one inherently interpretable classifier on three chest X-ray datasets. We find that Attri-Net produces high-quality multi-label explanations consistent with clinical knowledge and has comparable classification performance to state-of-the-art classification models.
Feature Importance versus Feature Influence and What It Signifies for Explainable AI
When used in the context of decision theory, feature importance expresses how much changing the value of a feature can change the model outcome (or the utility of the outcome), compared to other features. Feature importance should not be confused with the feature influence used by most state-of-the-art post-hoc Explainable AI methods. Contrary to feature importance, feature influence is measured against a reference level or baseline. The Contextual Importance and Utility (CIU) method provides a unified definition of global and local feature importance that is applicable also for post-hoc explanations, where the value utility concept provides instance-level assessment of how favorable or not a feature value is for the outcome. The paper shows how CIU can be applied to both global and local explainability, assesses the fidelity and stability of different methods, and shows how explanations that use contextual importance and contextual utility can provide more expressive and flexible explanations than when using influence only.
Selective Explanations: Leveraging Human Input to Align Explainable AI
Lai, Vivian, Zhang, Yiming, Chen, Chacha, Liao, Q. Vera, Tan, Chenhao
While a vast collection of explainable AI (XAI) algorithms have been developed in recent years, they are often criticized for significant gaps with how humans produce and consume explanations. As a result, current XAI techniques are often found to be hard to use and lack effectiveness. In this work, we attempt to close these gaps by making AI explanations selective -- a fundamental property of human explanations -- by selectively presenting a subset from a large set of model reasons based on what aligns with the recipient's preferences. We propose a general framework for generating selective explanations by leveraging human input on a small sample. This framework opens up a rich design space that accounts for different selectivity goals, types of input, and more. As a showcase, we use a decision-support task to explore selective explanations based on what the decision-maker would consider relevant to the decision task. We conducted two experimental studies to examine three out of a broader possible set of paradigms based on our proposed framework: in Study 1, we ask the participants to provide their own input to generate selective explanations, with either open-ended or critique-based input. In Study 2, we show participants selective explanations based on input from a panel of similar users (annotators). Our experiments demonstrate the promise of selective explanations in reducing over-reliance on AI and improving decision outcomes and subjective perceptions of the AI, but also paint a nuanced picture that attributes some of these positive effects to the opportunity to provide one's own input to augment AI explanations. Overall, our work proposes a novel XAI framework inspired by human communication behaviors and demonstrates its potentials to encourage future work to better align AI explanations with human production and consumption of explanations.