Self-Interpretable Model with Transformation Equivariant Interpretation

Oct-9-2024, 14:02:09 GMT–Neural Information Processing Systems

With the proliferation of machine learning applications in the real world, the demand for explaining machine learning predictions continues to grow, especially in high-stakes fields. Recent studies have found that interpretation methods can be sensitive and unreliable: the interpretations they produce can be disturbed by perturbations or transformations of the input data. To address this issue, we propose to learn robust interpretations through transformation equivariant regularization in a self-interpretable model. The resulting model is capable of capturing valid interpretations that are equivariant to geometric transformations. Moreover, since our model is self-interpretable, it enables faithful interpretations that reflect the true predictive mechanism.
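
As a rough illustration of what equivariance to a geometric transformation means for an interpretation map, the sketch below measures the gap between "interpret the transformed input" and "transform the interpretation of the original input". This is not the authors' model or loss. The function names (`equivariance_penalty`, `saliency_equivariant`, `saliency_not_equivariant`), the toy interpretation maps, and the choice of a 90-degree rotation are all assumptions made for illustration only.

```python
import numpy as np

def equivariance_penalty(interpret, x, transform):
    """Mean squared gap between interpret(transform(x)) and transform(interpret(x)).

    A value of zero means the interpretation map commutes with the transform
    on this input, i.e. it is equivariant for this (x, transform) pair.
    """
    a = interpret(transform(x))
    b = transform(interpret(x))
    return float(np.mean((a - b) ** 2))

# Hypothetical stand-ins for an interpretation (saliency) map.
def saliency_equivariant(x):
    # Pointwise magnitude commutes with any rearrangement of pixels,
    # including rotations and flips.
    return np.abs(x)

def saliency_not_equivariant(x):
    # Weighting by column index ties the map to absolute position,
    # so it does not commute with a rotation.
    return np.abs(x) * np.arange(x.shape[1])

def rot90(img):
    # Assumed geometric transformation: rotate by 90 degrees.
    return np.rot90(img, k=1)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8))  # toy square "image"

penalty_good = equivariance_penalty(saliency_equivariant, x, rot90)
penalty_bad = equivariance_penalty(saliency_not_equivariant, x, rot90)
```

Under these assumptions, a penalty of this general shape could be added to a training objective as a regularizer; the exact form of the regularizer used in the paper is not stated in this abstract.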

interpretation, self-interpretable model, transformation equivariant interpretation
