 explanation property


Directly Optimizing Explanations for Desired Properties

arXiv.org Artificial Intelligence

When explaining black-box machine learning models, it's often important for explanations to have certain desirable properties. Most existing methods "encourage" desirable properties in their construction of explanations. In this work, we demonstrate that these forms of encouragement do not consistently create explanations with the properties that are supposedly being targeted. Moreover, they do not allow for any control over which properties are prioritized when different properties are at odds with each other. We propose to directly optimize explanations for desired properties. Our direct approach not only produces explanations with optimal properties more consistently but also empowers users to control trade-offs between different properties, allowing them to create explanations with exactly what is needed for a particular task.
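To make the idea of directly optimizing an explanation for weighted properties concrete, here is a minimal sketch. It is not the paper's method: the `black_box` function, the choice of a local linear explanation, the squared-error faithfulness loss, the L1 complexity penalty, and the proximal gradient solver are all assumptions chosen for illustration. The `complexity_weight` parameter plays the role of the user-controlled trade-off.

```python
import numpy as np

def black_box(X):
    # Hypothetical model to be explained; stands in for any opaque predictor.
    return np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * X[:, 2]

def optimize_explanation(f, x0, complexity_weight, n_samples=500,
                         radius=0.3, steps=2000, seed=0):
    """Find linear-explanation weights w around x0 that minimize
    faithfulness_loss(w) + complexity_weight * ||w||_1  (assumed objective)."""
    rng = np.random.default_rng(seed)
    D = rng.normal(scale=radius, size=(n_samples, x0.size))  # local perturbations
    y = f(x0 + D) - f(x0[None, :])                           # change in model output

    # Step size from the Lipschitz constant of the smooth (faithfulness) term.
    lipschitz = 2.0 * np.linalg.norm(D, 2) ** 2 / n_samples
    lr = 1.0 / lipschitz

    w = np.zeros(x0.size)
    for _ in range(steps):
        # Gradient step on the faithfulness loss mean((D @ w - y) ** 2).
        grad = 2.0 * D.T @ (D @ w - y) / n_samples
        w = w - lr * grad
        # Proximal step for the L1 penalty (soft-thresholding).
        w = np.sign(w) * np.maximum(np.abs(w) - lr * complexity_weight, 0.0)

    faithfulness_loss = float(np.mean((D @ w - y) ** 2))
    return w, faithfulness_loss

x0 = np.array([0.5, 1.0, -0.2])
for weight in (0.0, 0.05):
    w, loss = optimize_explanation(black_box, x0, complexity_weight=weight)
    print(f"complexity_weight={weight}: weights={np.round(w, 3)}, "
          f"nonzero={np.count_nonzero(w)}, faithfulness_loss={loss:.4f}")
```

Raising `complexity_weight` pushes the optimizer toward explanations that use fewer features at the cost of a larger faithfulness loss, which is one simple way a user could steer the trade-off between two properties that are at odds.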


A Sim2Real Approach for Identifying Task-Relevant Properties in Interpretable Machine Learning

arXiv.org Artificial Intelligence

In the context of human+AI interaction, explanations of the underlying function can provide additional information to assist the human in performing their task. Recent literature suggests that explanations with different properties are useful for different tasks [Liao et al., 2022, Lai et al., 2023, Chen et al., 2023, Jesus et al., 2021, Wang et al., 2019, Liao et al., 2020, Lim and Dey, 2009]. For example, in an AI-auditing task, the user may need to check whether the AI inappropriately relied on a forbidden feature, such as using gender in computing a credit score [Kaur et al., 2020, Hase and Bansal, 2020a, Lakkaraju et al., 2019]. In this case, we would want explanations that are faithful; that is, they reliably capture the underlying behavior of the function. On the other hand, suppose our goal is to help a user quickly understand the process by which a function produces its output; we can quantify the user's understanding by measuring the user's ability to approximate the function's output, given the input and an explanation [Hase and Bansal, 2020b, Chandrasekaran et al., 2018]. In this case, we may want explanations with low complexity, so that the user can effectively reason using the explanation in a limited amount of time.
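The two properties discussed above can be given simple numeric stand-ins. The sketch below is illustrative only: the `credit_score` function, the feature names, and the specific definitions of `faithfulness_error` and `complexity` are assumptions, not metrics taken from the cited work. It contrasts an explanation that reveals reliance on a forbidden feature with a simpler one that hides it.

```python
import numpy as np

FEATURES = ["income", "debt", "gender"]  # hypothetical feature names

def credit_score(X):
    # Hypothetical scorer that (inappropriately) relies on the gender column.
    return 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.8 * X[:, 2]

def faithfulness_error(f, weights, X):
    """Mean squared gap between f and a linear explanation, after centering
    both (assumed metric; lower means more faithful)."""
    y = f(X)
    approx = X @ weights
    return float(np.mean(((y - y.mean()) - (approx - approx.mean())) ** 2))

def complexity(weights, tol=1e-8):
    """Number of features the user must reason about (assumed metric)."""
    return int(np.sum(np.abs(weights) > tol))

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.normal(size=n),           # income (standardized)
    rng.normal(size=n),           # debt (standardized)
    rng.integers(0, 2, size=n),   # gender encoded as 0/1
]).astype(float)

candidates = {
    "faithful": np.array([2.0, -1.0, 0.8]),  # exposes the gender dependence
    "simple": np.array([2.0, 0.0, 0.0]),     # easier to read, hides it
}
for name, weights in candidates.items():
    print(name,
          "faithfulness_error =", round(faithfulness_error(credit_score, weights, X), 4),
          "complexity =", complexity(weights))
```

Under these assumed metrics, an auditor would prefer the first explanation because it surfaces the nonzero weight on the forbidden feature, while a user who only needs a quick approximation of the output might accept the second.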


Towards a Comprehensive Human-Centred Evaluation Framework for Explainable AI

arXiv.org Artificial Intelligence

While research on explainable AI (XAI) is booming and explanation techniques have proven promising in many application domains, standardised human-centred evaluation procedures are still missing. In addition, current evaluation procedures do not assess XAI methods holistically in the sense that they do not treat explanations' effects on humans as a complex user experience. To tackle this challenge, we propose to adapt the User-Centric Evaluation Framework used in recommender systems: we integrate explanation aspects, summarise explanation properties, indicate relations between them, and categorise metrics that measure these properties. With this comprehensive evaluation framework, we hope to contribute to the human-centred standardisation of XAI evaluation.