 Explanation & Argumentation


Example-based Explanations for Random Forests using Machine Unlearning

arXiv.org Artificial Intelligence

Tree-based machine learning models, such as decision trees and random forests, have been hugely successful in classification tasks primarily because of their predictive power in supervised learning tasks and ease of interpretation. Despite their popularity and power, these models have been found to produce unexpected or discriminatory outcomes. Given their overwhelming success for most tasks, it is of interest to identify sources of their unexpected and discriminatory behavior. However, there has not been much work on understanding and debugging tree-based classifiers in the context of fairness. We introduce FairDebugger, a system that utilizes recent advances in machine unlearning research to identify training data subsets responsible for instances of fairness violations in the outcomes of a random forest classifier. FairDebugger generates top-$k$ explanations (in the form of coherent training data subsets) for model unfairness. Toward this goal, FairDebugger first utilizes machine unlearning to estimate the change in the tree structures of the random forest when parts of the underlying training data are removed, and then leverages the Apriori algorithm from frequent itemset mining to reduce the subset search space. We empirically evaluate our approach on three real-world datasets, and demonstrate that the explanations generated by FairDebugger are consistent with insights from prior studies on these datasets.
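The role of Apriori in narrowing the subset search can be illustrated with a minimal sketch. It is not FairDebugger's implementation: the function name `frequent_patterns`, the toy rows, and the support threshold are assumptions made for illustration, and the unlearning step that would score each candidate subset is omitted. The sketch only shows how conjunctions of attribute values that cover enough training rows are enumerated level by level, with infrequent patterns pruned before they are extended.

```python
from itertools import combinations


def frequent_patterns(rows, min_support, max_len=2):
    """Apriori-style search for attribute=value conjunctions that cover
    at least `min_support` (a fraction) of the rows. Hypothetical helper."""
    n = len(rows)

    # Level 1: count every single (attribute, value) predicate.
    counts = {}
    for row in rows:
        for item in row.items():
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {p: c / n for p, c in counts.items() if c / n >= min_support}
    frequent = dict(current)

    k = 1
    while current and k < max_len:
        k += 1
        candidates = set()
        # Join step: merge pairs of frequent (k-1)-patterns into k-patterns.
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) != k:
                continue
            # A pattern cannot require two values of the same attribute.
            if len({attr for attr, _ in union}) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(s) in current for s in combinations(union, k - 1)):
                candidates.add(union)
        # Count support only for the surviving candidates.
        current = {}
        for cand in candidates:
            support = sum(all(row.get(a) == v for a, v in cand) for row in rows) / n
            if support >= min_support:
                current[cand] = support
        frequent.update(current)
    return frequent


# Toy training rows (assumed data, not from the paper).
rows = [
    {"sex": "F", "age": "young"},
    {"sex": "F", "age": "young"},
    {"sex": "M", "age": "old"},
    {"sex": "F", "age": "old"},
]
patterns = frequent_patterns(rows, min_support=0.5)
```

Each surviving pattern describes a coherent training data subset; in a debugging setting, only these subsets would then need to be evaluated for their effect on a fairness metric.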


Explaining Learned Reward Functions with Counterfactual Trajectories

arXiv.org Artificial Intelligence

Learning rewards from human behaviour or feedback is a promising approach to aligning AI systems with human values but fails to consistently extract correct reward functions. Interpretability tools could enable users to understand and evaluate possible flaws in learned reward functions. We propose Counterfactual Trajectory Explanations (CTEs) to interpret reward functions in reinforcement learning by contrasting an original with a counterfactual partial trajectory and the rewards they each receive. We derive six quality criteria for CTEs and propose a novel Monte-Carlo-based algorithm for generating CTEs that optimises these quality criteria. Finally, we measure how informative the generated explanations are to a proxy-human model by training it on CTEs. CTEs are demonstrably informative for the proxy-human model, increasing the similarity between its predictions and the reward function on unseen trajectories. Further, it learns to accurately judge differences in rewards between trajectories and generalises to out-of-distribution examples. Although CTEs do not lead to a perfect understanding of the reward, our method, and more generally the adaptation of XAI methods, are presented as a fruitful approach for interpreting learned reward functions.


XAI-CF -- Examining the Role of Explainable Artificial Intelligence in Cyber Forensics

arXiv.org Artificial Intelligence

With the rise of complex cyber devices, Cyber Forensics (CF) is facing many new challenges. For example, there are dozens of systems running on smartphones, each with millions of downloadable applications. Sifting through this large amount of data and making sense of it requires new techniques, such as those from the field of Artificial Intelligence (AI). To apply these techniques successfully in CF, we need to justify and explain the results to the stakeholders of CF, such as forensic analysts and members of the court, so that they can make an informed decision. If we want to apply AI successfully in CF, there is a need to develop trust in AI systems. Other factors in the acceptance of AI in CF are making AI authentic, interpretable, understandable, and interactive. This way, AI systems will be more acceptable to the public and better aligned with legal standards. An explainable AI (XAI) system can play this role in CF, and we call such a system XAI-CF. XAI-CF is indispensable and is still in its infancy. In this paper, we explore and make a case for the significance and advantages of XAI-CF. We strongly emphasize the need to build a successful and practical XAI-CF system and discuss some of the main requirements and prerequisites of such a system. We present a formal definition of the terms CF and XAI-CF and a comprehensive literature review of previous works that apply and utilize XAI to build and increase trust in CF. We discuss some challenges facing XAI-CF and provide some concrete solutions to them. We identify key insights and future research directions for building XAI applications for CF. This paper is an effort to explore the role of XAI applications in CF and familiarize readers with it, and we believe that our work provides a promising basis for future researchers interested in XAI-CF.


A Critical Survey on Fairness Benefits of XAI

arXiv.org Artificial Intelligence

In this critical survey, we analyze typical claims on the relationship between explainable AI (XAI) and fairness to disentangle the multidimensional relationship between these two concepts. Based on a systematic literature review and a subsequent qualitative content analysis, we identify seven archetypal claims from 175 papers on the alleged fairness benefits of XAI. We present crucial caveats with respect to these claims and provide an entry point for future discussions around the potentials and limitations of XAI for specific fairness desiderata. Importantly, we notice that claims are often (i) vague and simplistic, (ii) lacking normative grounding, or (iii) poorly aligned with the actual capabilities of XAI. We encourage conceiving of XAI not as an ethical panacea but as one of many tools to approach the multidimensional, sociotechnical challenge of algorithmic fairness. Moreover, when making a claim about XAI and fairness, we emphasize the need to be more specific about what kind of XAI method is used, which fairness desideratum it refers to, how exactly it enables fairness, and which stakeholder benefits from XAI.


Counterfactual Generation with Answer Set Programming

arXiv.org Artificial Intelligence

Machine learning models that automate decision-making are increasingly being used in consequential areas such as loan approvals, pretrial bail approval, hiring, and many more. Unfortunately, most of these models are black boxes, i.e., they are unable to reveal how they reach these prediction decisions. A need for transparency demands justification for such predictions. An affected individual might also desire explanations to understand why a decision was made. Ethical and legal considerations may further require informing the individual of changes in the input attributes that could be made to produce a desirable outcome. This paper focuses on the latter problem of automatically generating counterfactual explanations. We propose a framework, Counterfactual Generation with s(CASP) (CFGS), that utilizes answer set programming (ASP) and the s(CASP) goal-directed ASP system to automatically generate counterfactual explanations from rules generated by rule-based machine learning (RBML) algorithms. In our framework, we show how counterfactual explanations are computed and justified by imagining worlds where some or all factual assumptions are altered. More importantly, we show how we can navigate between these worlds, namely, go from our original world/scenario, where we obtain an undesired outcome, to the imagined world/scenario, where we obtain a desired/favourable outcome.
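The idea of moving from the original world to an imagined world with a favourable outcome can be sketched in a few lines. This is not the CFGS framework and does not use ASP or s(CASP): it swaps in a brute-force search in Python, and the rule `approve`, the attribute domains, and the applicant are all hypothetical. It assumes the applicant currently receives the undesired outcome and looks for the smallest sets of attribute changes under which the rule holds.

```python
from itertools import combinations, product


def approve(applicant):
    """Hypothetical learned rule: approve when income is high, or when the
    applicant is employed and has low debt."""
    return applicant["income"] == "high" or (
        applicant["employed"] and applicant["debt"] == "low"
    )


# Assumed finite domains for each attribute.
DOMAINS = {
    "income": ["low", "high"],
    "employed": [False, True],
    "debt": ["low", "high"],
}


def counterfactuals(applicant, rule, domains):
    """Return all smallest sets of attribute changes that make `rule` true.
    Assumes `rule(applicant)` is currently false."""
    attrs = list(domains)
    for size in range(1, len(attrs) + 1):
        found = []
        for subset in combinations(attrs, size):
            # Only consider values that differ from the factual ones.
            alternatives = [
                [v for v in domains[a] if v != applicant[a]] for a in subset
            ]
            for values in product(*alternatives):
                changes = dict(zip(subset, values))
                world = {**applicant, **changes}  # the imagined world
                if rule(world):
                    found.append(changes)
        if found:
            return found  # stop at the smallest change size that works
    return []


applicant = {"income": "low", "employed": False, "debt": "low"}
explanations = counterfactuals(applicant, approve, DOMAINS)
```

For this applicant, each returned dictionary is one single-attribute change that flips the decision. A goal-directed ASP system would additionally produce a justification for each imagined world, which this sketch does not attempt.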


Collective Counterfactual Explanations via Optimal Transport

arXiv.org Artificial Intelligence

Counterfactual explanations provide individuals with cost-optimal actions that can alter their labels to desired classes. However, if a substantial number of individuals seek state modification, such individual-centric methods can lead to new competitions and unanticipated costs. Furthermore, these recommendations, which disregard the underlying data distribution, may suggest actions that users perceive as outliers. To address these issues, our work proposes a collective approach for formulating counterfactual explanations, with an emphasis on utilizing the current density of the individuals to inform the recommended actions. Our problem can naturally be cast as an optimal transport problem. Leveraging the extensive literature on optimal transport, we illustrate how this collective method improves upon the desiderata of classical counterfactual explanations. We support our proposal with numerical simulations, illustrating the effectiveness of the proposed approach and its relation to classic methods.
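The contrast between individual and collective recommendations can be illustrated with a toy discrete case. This sketch is not the paper's method: it assumes one-dimensional features, a squared-distance cost, equal unit mass on every point, and that each target position can absorb only one individual, in which case the optimal transport problem reduces to an assignment problem. The names `individual_choice` and `collective_assignment` are hypothetical, and the exhaustive search is only practical for very small inputs.

```python
from itertools import permutations


def cost(x, y):
    """Assumed cost of moving an individual from x to y."""
    return (x - y) ** 2


def individual_choice(sources, targets):
    """Each individual independently picks the cheapest target."""
    return tuple(
        min(range(len(targets)), key=lambda j: cost(s, targets[j]))
        for s in sources
    )


def collective_assignment(sources, targets):
    """Pick the one-to-one assignment with the lowest total cost by
    enumerating all assignments (brute force, small inputs only)."""
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(targets)), len(sources)):
        total = sum(cost(s, targets[j]) for s, j in zip(sources, perm))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_perm, best_cost


# Two individuals in the undesired class, two positions in the desired class.
sources = [0.0, 1.0]
targets = [2.0, 3.0]

greedy = individual_choice(sources, targets)
plan, total_cost = collective_assignment(sources, targets)
```

In this example both individuals independently choose the same target, which is the kind of crowding the abstract warns about, while the collective plan spreads them across the available targets.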


Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models

arXiv.org Artificial Intelligence

This paper introduces a system designed to generate explanations for the actions performed by an autonomous robot in Human-Robot Interaction (HRI). Explainability in robotics, encapsulated within the concept of an eXplainable Autonomous Robot (XAR), is a growing research area. The work described in this paper aims to take advantage of the capabilities of Large Language Models (LLMs) in performing natural language processing tasks. This study focuses on the possibility of generating explanations using such models in combination with a Retrieval Augmented Generation (RAG) method to interpret data gathered from the logs of autonomous systems. In addition, this work also presents a formalization of the proposed explanation system. It has been evaluated through a navigation test from the European Robotics League (ERL), a Europe-wide social robotics competition. Regarding the obtained results, a validation questionnaire has been conducted to measure the quality of the explanations from the perspective of technical users. The results obtained during the experiment highlight the potential utility of LLMs in achieving explanatory capabilities in robots.


Abstracted Trajectory Visualization for Explainability in Reinforcement Learning

arXiv.org Artificial Intelligence

Explainable AI (XAI) has demonstrated the potential to help reinforcement learning (RL) practitioners understand how RL models work. However, XAI for users who do not have RL expertise (non-RL experts) has not been studied sufficiently. This makes it difficult for non-RL experts to participate in the fundamental discussion of how RL models should be designed for a coming society where humans and AI coexist. Solving this problem would enable RL experts to communicate with non-RL experts in producing machine learning solutions that better fit our society. We argue that abstracted trajectories, which depict transitions between the major states of the RL model, will be useful for non-RL experts to build a mental model of the agents. Our early results suggest that by leveraging a visualization of the abstracted trajectories, users without RL expertise are able to infer the behavior patterns of RL.


Statistics without Interpretation: A Sober Look at Explainable Machine Learning

arXiv.org Artificial Intelligence

In the rapidly growing literature on explanation algorithms, it often remains unclear what precisely these algorithms are for and how they should be used. We argue that this is because explanation algorithms are often mathematically complex but don't admit a clear interpretation. Unfortunately, complex statistical methods that don't have a clear interpretation are bound to lead to errors in interpretation, a fact that has become increasingly apparent in the literature. In order to move forward, papers on explanation algorithms should make clear how precisely the output of the algorithms should be interpreted. They should also clarify what questions about the function can and cannot be answered given the explanations. Our argument is based on the distinction between statistics and their interpretation. It also relies on parallels between explainable machine learning and applied statistics.


Counterfactual Explanations of Black-box Machine Learning Models using Causal Discovery with Applications to Credit Rating

arXiv.org Artificial Intelligence

Explainable artificial intelligence (XAI) has helped elucidate the internal mechanisms of machine learning algorithms, bolstering their reliability by demonstrating the basis of their predictions. Several XAI models consider causal relationships to explain models by examining the input-output relationships of prediction models and the dependencies between features. The majority of these models have based their explanations on counterfactual probabilities, assuming that the causal graph is known. However, this assumption complicates the application of such models to real data, given that the causal relationships between features are unknown in most cases. Thus, this study proposed a novel XAI framework that relaxed the constraint that the causal graph is known. This framework leveraged counterfactual probabilities and additional prior information on causal structure, facilitating the integration of a causal graph estimated through causal discovery methods and a black-box classification model. Furthermore, explanatory scores were estimated based on counterfactual probabilities. Numerical experiments conducted on artificial data confirmed the possibility of estimating the explanatory score more accurately than in the absence of a causal graph. Finally, as an application to real data, we constructed a classification model of credit ratings assigned by Shiga Bank, Shiga prefecture, Japan. We demonstrated the effectiveness of the proposed method in cases where the causal graph is unknown.