Selective Explanations

May-27-2025, 03:38:14 GMT–Neural Information Processing Systems

Feature attribution methods explain black-box machine learning (ML) models by assigning importance scores to input features. These methods can be computationally expensive for large ML models. To address this challenge, there have been increasing efforts to develop amortized explainers, where a ML model is trained to efficiently approximate computationally expensive feature attribution scores. Despite their efficiency, amortized explainers can produce misleading explanations. In this paper, we propose selective explanations to (i) detect when amortized explainers generate inaccurate explanations and (ii) improve the approximation of the explanation using a technique we call explanations with initial guess.

artificial intelligence, explanation, machine learning, (6 more...)

Neural Information Processing Systems

May-27-2025, 03:38:14 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)