Negative Flux Aggregation to Estimate Feature Attributions
Xin Li, Deng Pan, Chengyin Li, Yao Qiang, Dongxiao Zhu
arXiv.org Artificial Intelligence
Due to the multi-layer nonlinearity of deep neural network architectures, explaining DNN predictions still remains an open problem, preventing us from gaining a deeper understanding of their mechanisms. To enhance the explainability of DNNs, we estimate the input features' attributions to the prediction task using divergence and flux. Inspired by the divergence theorem in vector analysis, we develop a novel Negative Flux Aggregation (NeFLAG) formulation and an efficient approximation algorithm to estimate the attribution map.

Gradient-based methods such as Saliency Map [Simonyan et al., 2013], SmoothGrad [Smilkov et al., 2017], FullGrad [Srinivas and Fleuret, 2019], and Integrated Gradient (IG) and its variants [Sundararajan et al., 2017; Hesse et al., 2021; Erion et al., 2021; Pan et al., 2021; Kapishnikov et al., 2019; Kapishnikov et al., 2021] require neither surrogates nor customized rules, but must tackle unstable estimates of gradients w.r.t. the given inputs. IG-type path-integration methods mitigate this issue by integrating gradients along a path for smoothing; however, this also introduces another degree of instability and noise sourced from arbitrary selections of baselines and integration paths.
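To make the instability concrete, below is a minimal sketch (not the authors' code) of the IG path integration discussed above, assuming a differentiable PyTorch classifier. The function name, the all-zero baseline, the straight-line path, and the step count are illustrative assumptions, and the latter three are exactly the arbitrary choices at issue.

```python
import torch

def integrated_gradients(model, x, target, baseline=None, steps=50):
    """Approximate IG along the straight line from `baseline` to `x`.

    model:    a differentiable torch.nn.Module returning class logits
    x:        input tensor of shape (1, ...)
    target:   index of the output logit being attributed
    baseline: reference input x' (defaults to zeros, an arbitrary choice)
    steps:    number of Riemann-sum steps approximating the path integral
    """
    if baseline is None:
        baseline = torch.zeros_like(x)
    total_grad = torch.zeros_like(x)
    for k in range(1, steps + 1):
        # Point on the straight path: x' + (k / steps) * (x - x')
        point = baseline + (k / steps) * (x - baseline)
        point.requires_grad_(True)
        score = model(point)[0, target]
        # Gradient of the target logit w.r.t. the interpolated input
        grad = torch.autograd.grad(score, point)[0]
        total_grad += grad
    # IG_i(x) = (x_i - x'_i) * average gradient along the path
    return (x - baseline) * total_grad / steps

# Example usage with a toy model (shapes are illustrative only).
model = torch.nn.Sequential(torch.nn.Linear(4, 3))
x = torch.randn(1, 4)
attributions = integrated_gradients(model, x, target=0)
```

Re-running this sketch with different baselines or step counts would change the returned attributions, illustrating the baseline- and path-dependence the paper points to.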
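For reference, the divergence theorem from vector analysis that the abstract cites relates the volume integral of a vector field's divergence to its flux through the enclosing boundary; how the paper instantiates the field and aggregates the negative flux is detailed in the paper itself and not reproduced here:

```latex
% Divergence (Gauss) theorem: for a smooth vector field F on a region V
% with boundary surface \partial V and outward unit normal n,
\int_{V} (\nabla \cdot \mathbf{F}) \, dV
  \;=\; \oint_{\partial V} \mathbf{F} \cdot \mathbf{n} \, dS
```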
May-13-2023