Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance

Neural Information Processing Systems 

Interpretability methods are valuable only if their explanations faithfully describe the explained model. In this work, we consider neural networks whose predictions are invariant under a specific symmetry group. This includes popular architectures, ranging from convolutional to graph neural networks. Any faithful explanation of such a model must agree with this invariance property.
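To make this property concrete, the following is a minimal sketch of the idea, not the paper's implementation. It assumes a toy model whose prediction is invariant under cyclic shifts of the input and checks that a gradient-based attribution is *equivariant*: explaining a shifted input should yield the shifted explanation. The names `saliency` and `explanation_equivariance` are illustrative, not taken from the paper.

```python
import numpy as np

def model(x):
    # Toy shift-invariant model: the prediction is unchanged by any
    # permutation of the input features, including cyclic shifts.
    return np.sum(np.sin(x))

def saliency(f, x, eps=1e-5):
    # Finite-difference gradient as a simple feature-attribution explanation.
    grad = np.zeros_like(x)
    for i in range(len(x)):
        d = np.zeros_like(x)
        d[i] = eps
        grad[i] = (f(x + d) - f(x - d)) / (2 * eps)
    return grad

def explanation_equivariance(f, x, shifts, atol=1e-4):
    # Fraction of group elements g (here: cyclic shifts) for which
    # explaining the transformed input matches transforming the explanation:
    # saliency(f, g.x) ≈ g.saliency(f, x).
    e0 = saliency(f, x)
    hits = [
        np.allclose(saliency(f, np.roll(x, s)), np.roll(e0, s), atol=atol)
        for s in shifts
    ]
    return float(np.mean(hits))
```

For a shift-invariant model the gradient explanation is exactly equivariant, so the score is 1.0; an explanation method that ignores the model's symmetry would score lower on such a metric.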
