AIDE: Antithetical, Intent-based, and Diverse Example-Based Explanations
Ikhtiyor Nematov, Dimitris Sacharidis, Tomer Sagi, Katja Hose
arXiv.org Artificial Intelligence
In many use cases, it is important to explain the prediction of a black-box model by identifying the most influential training data samples. Existing approaches are not tailored to user intent and often return a homogeneous set of explanation samples, failing to reveal the model's reasoning from different angles. In this paper, we propose AIDE, an approach for providing antithetical (i.e., contrastive), intent-based, and diverse explanations for opaque and complex models. AIDE distinguishes three types of explainability intents: interpreting a correct prediction, investigating a wrong prediction, and clarifying an ambiguous prediction. For each intent, AIDE selects an appropriate set of influential training samples that support or oppose the prediction either directly or by contrast. To provide a succinct summary, AIDE uses diversity-aware sampling to avoid redundancy and increase coverage of the training data. We demonstrate the effectiveness of AIDE on image and text classification tasks in three ways: quantitatively, by assessing correctness and continuity; qualitatively, by comparing anecdotal evidence from AIDE and other example-based approaches; and via a user study that evaluates multiple aspects of AIDE. The results show that AIDE addresses the limitations of existing methods and exhibits desirable traits for an explainability method.
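The idea of diversity-aware sampling over influential training samples can be illustrated with a minimal sketch. This is not AIDE's actual procedure, which the abstract does not specify; it is a generic greedy, maximal-marginal-relevance-style selection, and the function names (`cosine`, `select_diverse`), the `trade_off` parameter, and the use of cosine similarity over feature vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_diverse(influence, features, k, trade_off=0.5):
    """Greedily pick up to k sample indices, balancing influence against redundancy.

    influence: hypothetical per-sample influence scores (higher = more influential).
    features:  hypothetical per-sample feature vectors used to measure similarity.
    trade_off: assumed weight in [0, 1]; 1.0 ranks by influence only.
    """
    selected = []
    candidates = set(range(len(influence)))
    while candidates and len(selected) < k:

        def gain(i):
            # Redundancy is the highest similarity to any sample already chosen.
            redundancy = max(
                (cosine(features[i], features[j]) for j in selected), default=0.0
            )
            return trade_off * influence[i] - (1 - trade_off) * redundancy

        # Iterate in sorted order so ties resolve to the lowest index.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With three samples where the two most influential are near-duplicates in feature space, this sketch skips the second duplicate in favour of the less influential but dissimilar sample, which is the kind of redundancy avoidance the abstract describes. Setting `trade_off=1.0` reduces it to a plain top-k ranking by influence.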
Aug-8-2024