A Feature Importance Explanation Methods

We briefly review several FI explanation methods and explain how they are used in this paper. These methods can be classified as gradient-based (1-2), attention-based (3), and perturbation-based (4-7). Note that when computing derivatives of model outputs for explanation methods, we use the logit of the predicted class rather than the predicted probability, for numerical stability.

One of the gradient-based methods estimates the integral in Integrated Gradients [54] by Monte Carlo sampling in order to speed up computation, and it uses the data distribution to obtain baseline inputs, which we approximate using the training dataset D. We also consider alternative baselines.

The attention-based approach treats the attention weights in a model as an explanation of feature importance. For the Up-Down model [2], we use its sole set of top-down attention weights, but early experiments suggest this is not an effective method and we do not explore it further.
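To make the Monte Carlo estimator concrete, the sketch below computes the attribution for a single input in PyTorch: baselines are sampled from the training data, interpolation points are sampled uniformly along the straight-line path, and the gradient is taken with respect to the logit of the predicted class, as noted above. This is a minimal sketch rather than the exact implementation used in our experiments; the names (expected_gradients_mc, baseline_data, n_samples) are illustrative, the input is assumed to be a flat feature vector, and the model is assumed to map a batch of inputs to class logits.

```python
import torch

def expected_gradients_mc(model, x, baseline_data, n_samples=32):
    """Monte Carlo estimate of the Integrated Gradients integral, with
    baseline inputs drawn from the training data (names are illustrative).

    model:         callable mapping a batch of inputs to class logits
    x:             input to explain, shape (d,)
    baseline_data: candidate baselines (e.g. training inputs), shape (n, d)
    """
    # Class whose score we explain: the predicted class. We differentiate
    # its logit rather than its probability, for numerical stability.
    with torch.no_grad():
        pred = model(x.unsqueeze(0)).argmax(dim=-1).item()

    attribution = torch.zeros_like(x)
    for _ in range(n_samples):
        # Sample a baseline x' from the data and a path position alpha ~ U(0, 1).
        idx = torch.randint(len(baseline_data), (1,)).item()
        x_base = baseline_data[idx]
        alpha = torch.rand(())

        # Point on the straight-line path from the baseline to the input.
        point = (x_base + alpha * (x - x_base)).detach().requires_grad_(True)

        # Gradient of the predicted-class logit w.r.t. the interpolated input.
        logit = model(point.unsqueeze(0))[0, pred]
        grad, = torch.autograd.grad(logit, point)

        # Single-sample estimate: (x - x') times the gradient at the path point.
        attribution = attribution + (x - x_base) * grad

    return attribution / n_samples
```

Increasing n_samples reduces the variance of the estimate at the cost of one additional forward and backward pass per sample.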
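For completeness, the attention-based explanation can be sketched as follows: the model's attention distribution over its input regions is read off and used directly as the importance scores. The interface here (a forward pass that returns both logits and attention weights) is a hypothetical stand-in and does not reflect the Up-Down model's actual API.

```python
import torch

def attention_explanation(model, x):
    """Treat a model's attention weights over its N input regions as feature
    importance scores. Hypothetical interface: the forward pass is assumed
    to return (logits, attention), with attention of shape (1, N)."""
    with torch.no_grad():
        logits, attention = model(x.unsqueeze(0))
    # The attention weight assigned to each region serves as its importance.
    return attention.squeeze(0)
```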