Explanation & Argumentation
Should XAI Nudge Human Decisions with Explanation Biasing?
Fukuchi, Yosuke, Yamada, Seiji
This paper reviews our previous trials of Nudge-XAI, an approach that introduces automatic biases into explanations from explainable AIs (XAIs) with the aim of leading users to better decisions, and it discusses the benefits and challenges. Nudge-XAI uses a user model that predicts the influence of providing an explanation or emphasizing it and attempts to guide users toward AI-suggested decisions without coercion. The nudge design is expected to enhance the autonomy of users, reduce the risk associated with an AI making decisions without users' full agreement, and enable users to avoid AI failures. To discuss the potential of Nudge-XAI, this paper reports a post-hoc investigation of previous experimental results using cluster analysis. The results demonstrate the diversity of user behavior in response to Nudge-XAI, which supports our aim of enhancing user autonomy. However, it also highlights the challenge of users who distrust AI and falsely make decisions contrary to AI suggestions, suggesting the need for personalized adjustment of the strength of nudges to make this approach work more generally.
Are Objective Explanatory Evaluation metrics Trustworthy? An Adversarial Analysis
Chowdhury, Prithwijit, Prabhushankar, Mohit, AlRegib, Ghassan, Deriche, Mohamed
Explainable AI (XAI) has revolutionized the field of deep learning by empowering users to have more trust in neural network models. The field of XAI allows users to probe the inner workings of these algorithms to elucidate their decision-making processes. The rise in popularity of XAI has led to the advent of different strategies to produce explanations, all of which only occasionally agree. Thus several objective evaluation metrics have been devised to decide which of these modules give the best explanation for specific scenarios. The goal of the paper is twofold: (i) we employ the notions of necessity and sufficiency from causal literature to come up with a novel explanatory technique called SHifted Adversaries using Pixel Elimination(SHAPE) which satisfies all the theoretical and mathematical criteria of being a valid explanation, (ii) we show that SHAPE is, infact, an adversarial explanation that fools causal metrics that are employed to measure the robustness and reliability of popular importance based visual XAI methods. Our analysis shows that SHAPE outperforms popular explanatory techniques like GradCAM and GradCAM++ in these tests and is comparable to RISE, raising questions about the sanity of these metrics and the need for human involvement for an overall better evaluation.
Evolutionary Computation and Explainable AI: A Roadmap to Transparent Intelligent Systems
Zhou, Ryan, Bacardit, Jaume, Brownlee, Alexander, Cagnoni, Stefano, Fyvie, Martin, Iacca, Giovanni, McCall, John, van Stein, Niki, Walker, David, Hu, Ting
AI methods are finding an increasing number of applications, but their often black-box nature has raised concerns about accountability and trust. The field of explainable artificial intelligence (XAI) has emerged in response to the need for human understanding of AI models. Evolutionary computation (EC), as a family of powerful optimization and learning tools, has significant potential to contribute to XAI. In this paper, we provide an introduction to XAI and review various techniques in current use for explaining machine learning (ML) models. We then focus on how EC can be used in XAI, and review some XAI approaches which incorporate EC techniques. Additionally, we discuss the application of XAI principles within EC itself, examining how these principles can shed some light on the behavior and outcomes of EC algorithms in general, on the (automatic) configuration of these algorithms, and on the underlying problem landscapes that these algorithms optimize. Finally, we discuss some open challenges in XAI and opportunities for future research in this field using EC. Our aim is to demonstrate that EC is well-suited for addressing current problems in explainability and to encourage further exploration of these methods to contribute to the development of more transparent and trustworthy ML models and EC algorithms.
AI-Driven Predictive Analytics Approach for Early Prognosis of Chronic Kidney Disease Using Ensemble Learning and Explainable AI
Jawad, K M Tawsik, Verma, Anusha, Amsaad, Fathi
Chronic Kidney Disease (CKD) is one of the widespread Chronic diseases with no known ultimo cure and high morbidity. Research demonstrates that progressive Chronic Kidney Disease (CKD) is a heterogeneous disorder that significantly impacts kidney structure and functions, eventually leading to kidney failure. With the progression of time, chronic kidney disease has moved from a life-threatening disease affecting few people to a common disorder of varying severity. The goal of this research is to visualize dominating features, feature scores, and values exhibited for early prognosis and detection of CKD using ensemble learning and explainable AI. For that, an AI-driven predictive analytics approach is proposed to aid clinical practitioners in prescribing lifestyle modifications for individual patients to reduce the rate of progression of this disease. Our dataset is collected on body vitals from individuals with CKD and healthy subjects to develop our proposed AI-driven solution accurately. In this regard, blood and urine test results are provided, and ensemble tree-based machine-learning models are applied to predict unseen cases of CKD. Our research findings are validated after lengthy consultations with nephrologists. Our experiments and interpretation results are compared with existing explainable AI applications in various healthcare domains, including CKD. The comparison shows that our developed AI models, particularly the Random Forest model, have identified more features as significant contributors than XgBoost. Interpretability (I), which measures the ratio of important to masked features, indicates that our XgBoost model achieved a higher score, specifically a Fidelity of 98\%, in this metric and naturally in the FII index compared to competing models.
Provably Better Explanations with Optimized Aggregation of Feature Attributions
Decker, Thomas, Bhattarai, Ananta R., Gu, Jindong, Tresp, Volker, Buettner, Florian
Using feature attributions for post-hoc explanations is a common practice to understand and verify the predictions of opaque machine learning models. Despite the numerous techniques available, individual methods often produce inconsistent and unstable results, putting their overall reliability into question. In this work, we aim to systematically improve the quality of feature attributions by combining multiple explanations across distinct methods or their variations. For this purpose, we propose a novel approach to derive optimal convex combinations of feature attributions that yield provable improvements of desired quality criteria such as robustness or faithfulness to the model behavior. Through extensive experiments involving various model architectures and popular feature attribution techniques, we demonstrate that our combination strategy consistently outperforms individual methods and existing baselines.
Contrastive explainable clustering with differential privacy
Nguyen, Dung, Vetzler, Ariel, Kraus, Sarit, Vullikanti, Anil
This paper presents a novel approach in Explainable AI (XAI), integrating contrastive explanations with differential privacy in clustering methods. For several basic clustering problems, including $k$-median and $k$-means, we give efficient differential private contrastive explanations that achieve essentially the same explanations as those that non-private clustering explanations can obtain. We define contrastive explanations as the utility difference between the original clustering utility and utility from clustering with a specifically fixed centroid. In each contrastive scenario, we designate a specific data point as the fixed centroid position, enabling us to measure the impact of this constraint on clustering utility under differential privacy. Extensive experiments across various datasets show our method's effectiveness in providing meaningful explanations without significantly compromising data privacy or clustering utility. This underscores our contribution to privacy-aware machine learning, demonstrating the feasibility of achieving a balance between privacy and utility in the explanation of clustering tasks.
Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech
Yadav, Neemesh, Masud, Sarah, Goyal, Vikram, Goyal, Vikram, Akhtar, Md Shad, Chakraborty, Tanmoy
Employing language models to generate explanations for an incoming implicit hate post is an active area of research. The explanation is intended to make explicit the underlying stereotype and aid content moderators. The training often combines top-k relevant knowledge graph (KG) tuples to provide world knowledge and improve performance on standard metrics. Interestingly, our study presents conflicting evidence for the role of the quality of KG tuples in generating implicit explanations. Consequently, simpler models incorporating external toxicity signals outperform KG-infused models. Compared to the KG-based setup, we observe a comparable performance for SBIC (LatentHatred) datasets with a performance variation of +0.44 (+0.49), +1.83 (-1.56), and -4.59 (+0.77) in BLEU, ROUGE-L, and BERTScore. Further human evaluation and error analysis reveal that our proposed setup produces more precise explanations than zero-shot GPT-3.5, highlighting the intricate nature of the task.
Analyzing the Influence of Training Samples on Explanations
Artelt, André, Hammer, Barbara
EXplainable AI (XAI) constitutes a popular method to analyze the reasoning of AI systems by explaining their decision-making, e.g. providing a counterfactual explanation of how to achieve recourse. However, in cases such as unexpected explanations, the user might be interested in learning about the cause of this explanation -- e.g. properties of the utilized training data that are responsible for the observed explanation. Under the umbrella of data valuation, first approaches have been proposed that estimate the influence of data samples on a given model. In this work, we take a slightly different stance, as we are interested in the influence of single samples on a model explanation rather than the model itself. Hence, we propose the novel problem of identifying training data samples that have a high influence on a given explanation (or related quantity) and investigate the particular case of differences in the cost of the recourse between protected groups. For this, we propose an algorithm that identifies such influential training samples.
Logic-Based Explainability: Past, Present & Future
In recent years, the impact of machine learning (ML) and artificial intelligence (AI) in society has been absolutely remarkable. This impact is expected to continue in the foreseeable future. However,the adoption of AI/ML is also a cause of grave concern. The operation of the most advances AI/ML models is often beyond the grasp of human decision makers. As a result, decisions that impact humans may not be understood and may lack rigorous validation. Explainable AI (XAI) is concerned with providing human decision-makers with understandable explanations for the predictions made by ML models. As a result, XAI is a cornerstone of trustworthy AI. Despite its strategic importance, most work on XAI lacks rigor, and so its use in high-risk or safety-critical domains serves to foster distrust instead of contributing to build much-needed trust. Logic-based XAI has recently emerged as a rigorous alternative to those other non-rigorous methods of XAI. This paper provides a technical survey of logic-based XAI, its origins, the current topics of research, and emerging future topics of research. The paper also highlights the many myths that pervade non-rigorous approaches for XAI.
CoLa-DCE -- Concept-guided Latent Diffusion Counterfactual Explanations
Motzkus, Franz, Hellert, Christian, Schmid, Ute
Recent advancements in generative AI have introduced novel prospects and practical implementations. Especially diffusion models show their strength in generating diverse and, at the same time, realistic features, positioning them well for generating counterfactual explanations for computer vision models. Answering "what if" questions of what needs to change to make an image classifier change its prediction, counterfactual explanations align well with human understanding and consequently help in making model behavior more comprehensible. Current methods succeed in generating authentic counterfactuals, but lack transparency as feature changes are not directly perceivable. To address this limitation, we introduce Concept-guided Latent Diffusion Counterfactual Explanations (CoLa-DCE). CoLa-DCE generates concept-guided counterfactuals for any classifier with a high degree of control regarding concept selection and spatial conditioning. The counterfactuals comprise an increased granularity through minimal feature changes. The reference feature visualization ensures better comprehensibility, while the feature localization provides increased transparency of "where" changed "what". We demonstrate the advantages of our approach in minimality and comprehensibility across multiple image classification models and datasets and provide insights into how our CoLa-DCE explanations help comprehend model errors like misclassification cases.