Enhancing Model Interpretability with Local Attribution over Global Exploration

Zhu, Zhiyu, Jin, Zhibo, Zhang, Jiayu, Chen, Huaming

Aug-14-2024–arXiv.org Artificial Intelligence

In the field of artificial intelligence, AI models are frequently described as "black boxes" because their internal mechanisms are opaque. This has ignited research interest in model interpretability, especially in attribution methods that offer precise explanations of model decisions. Current attribution algorithms typically evaluate the importance of each parameter by exploring the sample space. This exploration introduces a large number of intermediate states, some of which may fall in the model's Out-of-Distribution (OOD) space. Such intermediate states affect the attribution results, making it challenging to grasp the relative importance of features. In this paper, we first define the local space and its relevant properties, and then propose the Local Attribution (LA) algorithm, which leverages these properties. The LA algorithm comprises targeted and untargeted exploration phases, designed to generate intermediate states for attribution that thoroughly cover the local space. Compared to state-of-the-art attribution methods, our approach achieves an average improvement of 38.21% in attribution effectiveness. Extensive ablation studies also validate the significance of each component of our algorithm. Our code is available at: https://github.com/LMBTough/LA/
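
To make the notion of "intermediate states" concrete, the sketch below implements Integrated Gradients, a standard path-based attribution method, on a toy two-feature function. It is not the LA algorithm and is not taken from the authors' repository; `toy_model`, `toy_gradient`, and `integrated_gradients` are hypothetical names chosen for illustration, and the all-zero baseline is an assumption. Each point visited on the straight line from the baseline to the input is an intermediate state of the kind the abstract describes, and nothing constrains those points to stay within the data distribution.

```python
from typing import Callable, List, Sequence


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical differentiable 'model': f(x) = x0^2 + 3*x1."""
    return x[0] ** 2 + 3.0 * x[1]


def toy_gradient(x: Sequence[float]) -> List[float]:
    """Analytic gradient of toy_model with respect to its inputs."""
    return [2.0 * x[0], 3.0]


def integrated_gradients(
    x: Sequence[float],
    baseline: Sequence[float],
    grad_fn: Callable[[Sequence[float]], List[float]],
    steps: int = 50,
) -> List[float]:
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    Every `state` below is an intermediate point between the baseline
    and the input; gradients are averaged over these points.
    """
    grad_sum = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        state = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_fn(state)
        grad_sum = [s + g for s, g in zip(grad_sum, grad)]
    # Scale the average gradient by the input-baseline difference.
    return [(xi - b) * s / steps for xi, b, s in zip(x, baseline, grad_sum)]


if __name__ == "__main__":
    x = [2.0, 1.0]
    baseline = [0.0, 0.0]  # assumed baseline, chosen for illustration
    attributions = integrated_gradients(x, baseline, toy_gradient)
    print(attributions)
    # The attributions should sum to f(x) - f(baseline).
    print(sum(attributions), toy_model(x) - toy_model(baseline))
```

The final check illustrates the completeness property of Integrated Gradients: the per-feature attributions add up to the change in model output between the baseline and the input.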
