Gyevnar, Balint
Objective Metrics for Human-Subjects Evaluation in Explainable Reinforcement Learning
Gyevnar, Balint, Towers, Mark
Explanation is a fundamentally human process. Understanding the goal and audience of an explanation is vital, yet existing work on explainable reinforcement learning (XRL) routinely omits humans from its evaluations. Even when humans are consulted, studies tend to rely on subjective metrics, such as confidence or understanding, which capture users' opinions rather than the practical effectiveness of explanations for a given problem. This paper calls on researchers to use objective human metrics for explanation evaluation, based on observable and actionable behaviour, to build more reproducible, comparable, and epistemically grounded research. To this end, we curate, describe, and compare several objective evaluation methodologies for applying explanations to debugging agent behaviour and supporting human-agent teaming, illustrating the proposed methods in a novel grid-based environment. We discuss how subjective and objective metrics complement each other to provide holistic validation, and how future work should adopt standardised benchmarks to enable greater comparability across studies.
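To make the idea of objective, behaviour-based metrics concrete, the following is a minimal sketch of how such metrics might be aggregated from logged participant interactions in a debugging study. The trial structure and field names (e.g., identified_fault, time_to_decision_s) are illustrative assumptions, not the paper's actual protocol.

    # Hypothetical sketch: aggregating observable participant behaviour into
    # objective metrics (no self-reports). All field names are assumptions.
    from dataclasses import dataclass
    from statistics import mean

    @dataclass
    class Trial:
        participant_id: str
        identified_fault: str      # fault the participant reported after seeing an explanation
        true_fault: str            # ground-truth fault injected into the agent
        time_to_decision_s: float  # seconds from explanation shown to decision logged
        task_completed: bool       # whether the joint human-agent task was finished

    def objective_metrics(trials: list[Trial]) -> dict[str, float]:
        """Summarise observable behaviour across trials."""
        return {
            "fault_identification_accuracy": mean(
                t.identified_fault == t.true_fault for t in trials
            ),
            "mean_time_to_decision_s": mean(t.time_to_decision_s for t in trials),
            "task_completion_rate": mean(t.task_completed for t in trials),
        }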
People Attribute Purpose to Autonomous Vehicles When Explaining Their Behavior
Gyevnar, Balint, Droop, Stephanie, Quillien, Tadeg, Cohen, Shay B., Bramley, Neil R., Lucas, Christopher G., Albrecht, Stefano V.
Cognitive science can help us understand which explanations people expect and in what format they frame them, whether causal, counterfactual, or teleological (i.e., purpose-oriented). Understanding the relevance of these concepts is crucial for building good explainable AI (XAI) that offers recourse and actionability. Focusing on autonomous driving, a complex decision-making domain, we report empirical data from two surveys on (i) how people explain the behavior of autonomous vehicles in 14 unique scenarios (N1=54), and (ii) how they perceive these explanations in terms of complexity, quality, and trustworthiness (N2=356). Participants rated teleological explanations as significantly higher in quality than counterfactual ones, with perceived teleology being the best predictor of both perceived quality and trustworthiness. Neither perceived teleology nor quality was affected by whether the car was an autonomous vehicle or driven by a person, indicating that people use teleology to evaluate information not just about other people but also about autonomous vehicles. Taken together, our findings highlight the importance of explanations framed in terms of purpose rather than, as is standard in XAI, only the causal mechanisms involved. We publicly release the 14 scenarios and more than 1,300 elicited explanations as the Human Explanations for Autonomous Driving Decisions (HEADD) dataset.
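As a rough illustration of the kind of analysis described above, the sketch below regresses perceived explanation quality on perceived teleology and the driver condition. The column names, the file headd_ratings.csv, and the choice of plain OLS are assumptions for illustration only, not the paper's actual analysis.

    # Illustrative sketch, not the study's analysis code.
    import pandas as pd
    import statsmodels.formula.api as smf

    ratings = pd.read_csv("headd_ratings.csv")  # hypothetical export of the survey responses

    # Does perceived teleology predict perceived quality, controlling for whether
    # the car was described as an AV or a human driver?
    model = smf.ols(
        "perceived_quality ~ perceived_teleology + C(driver_type)",
        data=ratings,
    ).fit()
    print(model.summary())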
Explainable AI for Safe and Trustworthy Autonomous Driving: A Systematic Review
Kuznietsov, Anton, Gyevnar, Balint, Wang, Cheng, Peters, Steven, Albrecht, Stefano V.
Artificial Intelligence (AI) shows promise for perception and planning tasks in autonomous driving (AD) due to its superior performance compared to conventional methods. However, inscrutable AI systems exacerbate the existing challenge of safety assurance in AD. One way to mitigate this challenge is to utilize explainable AI (XAI) techniques. To this end, we present the first comprehensive systematic literature review of explainable methods for safe and trustworthy AD. We begin by analyzing the requirements for AI in the context of AD, focusing on three key aspects: data, model, and agency. We find that XAI is fundamental to meeting these requirements. Based on this, we explain the sources of explanations in AI and describe a taxonomy of XAI. We then identify five key contributions of XAI to safe and trustworthy AI in AD: interpretable design, interpretable surrogate models, interpretable monitoring, auxiliary explanations, and interpretable validation. Finally, we propose a modular framework called SafeX that integrates these contributions, enabling explanation delivery to users while ensuring the safety of AI models.
Causal Explanations for Sequential Decision-Making in Multi-Agent Systems
Gyevnar, Balint, Wang, Cheng, Lucas, Christopher G., Cohen, Shay B., Albrecht, Stefano V.
We present CEMA (Causal Explanations in Multi-Agent systems), a general framework for creating causal explanations of an agent's decisions in sequential multi-agent systems. The core of CEMA is a novel causal selection method inspired by how humans select causes for explanations. Unlike prior work that assumes a specific causal structure, CEMA is applicable whenever a probabilistic model for predicting future states of the environment is available. Given such a model, CEMA samples counterfactual worlds that reveal the salient causes behind the agent's decisions. We evaluate CEMA on the task of motion planning for autonomous driving and test it in diverse simulated scenarios. We show that CEMA correctly and robustly identifies the causes behind decisions, even when many other agents are present, and a user study shows that CEMA's explanations positively affect participants' trust in autonomous vehicles and are rated at least as highly as high-quality human explanations elicited from other participants.
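The following is a minimal sketch of counterfactual-sampling-based cause selection in the spirit described above, assuming only that a probabilistic forward model and the agent's decision rule are available. It is not CEMA's actual implementation; all names are illustrative.

    def cause_saliency(state, candidate_causes, forward_model, decide, n_samples=100):
        """Score each candidate cause by how often intervening on it changes the decision."""
        factual_decision = decide(state)  # decision taken in the factual world
        scores = {}
        for cause in candidate_causes:
            flips = 0
            for _ in range(n_samples):
                # Sample a counterfactual world in which this candidate cause is altered
                # (hypothetical forward_model.sample interface).
                cf_state = forward_model.sample(state, intervention=cause)
                if decide(cf_state) != factual_decision:
                    flips += 1
            scores[cause] = flips / n_samples  # higher score = more salient cause
        return scores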
Bridging the Transparency Gap: What Can Explainable AI Learn From the AI Act?
Gyevnar, Balint, Ferguson, Nick, Schafer, Burkhard
The European Union has proposed the Artificial Intelligence Act, which introduces detailed transparency requirements for AI systems. Many of these requirements can be addressed by the field of explainable AI (XAI); however, there is a fundamental difference between XAI and the Act regarding what transparency is. The Act views transparency as a means that supports wider values, such as accountability, human rights, and sustainable innovation. In contrast, XAI views transparency narrowly as an end in itself, focusing on explaining complex algorithmic properties without considering the socio-technical context. We call this difference the "transparency gap". If the transparency gap is not addressed, XAI risks leaving a range of transparency issues unresolved. To begin to bridge this gap, we overview and clarify the terminology of how XAI and European regulation -- the Act and the related General Data Protection Regulation (GDPR) -- view basic definitions of transparency. By comparing the disparate views of XAI and regulation, we arrive at four axes along which practical work could bridge the transparency gap: defining the scope of transparency, clarifying the legal status of XAI, addressing issues with conformity assessment, and building explainability for datasets.