Reviews: A Benchmark for Interpretability Methods in Deep Neural Networks

Neural Information Processing Systems 

Summary --- This paper proposes to evaluate saliency/importance visual explanations by removing the "important" pixels from images and measuring whether a retrained classifier can still classify those images correctly. Many explanation methods fail to remove class-relevant information under this test, but some ensembling techniques succeed, effectively removing entire objects; these are judged to be better explanations. The paper takes the view that important information is precisely the information a classifier can use to predict the correct label. Under this view, one can score an importance estimate by how much classification performance drops when the important pixels are removed from all images in both the training and validation sets.
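The removal step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes grayscale images and replaces each image's top-fraction most-important pixels with that image's mean value (one common ablation choice); the function name and array shapes are my own.

```python
import numpy as np

def remove_top_pixels(images, importance, fraction):
    """Replace the top `fraction` most-important pixels of each image
    with that image's mean value (one common ablation choice).

    images:     (N, H, W) float array of images
    importance: (N, H, W) importance estimates, higher = more important
    fraction:   fraction of pixels to remove per image
    """
    out = images.copy()
    n, h, w = images.shape
    k = int(fraction * h * w)
    if k == 0:
        return out
    for i in range(n):
        flat_importance = importance[i].ravel()
        # indices of the k most-important pixels in this image
        idx = np.argpartition(flat_importance, -k)[-k:]
        out_view = out[i].ravel()  # view into `out`, so writes stick
        out_view[idx] = images[i].mean()
    return out
```

In the full benchmark, both the train and validation sets would be modified this way, a fresh classifier would be trained on the modified train set, and its validation accuracy compared against a classifier trained on unmodified data; a larger accuracy drop indicates a better importance estimate.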