AITopics | attribution method

Collaborating Authors

attribution method

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Attributions All the Way Down? The Metagame of Interpretability

Baniecki, Hubert, Biecek, Przemyslaw, Fumagalli, Fabian

arXiv.org Machine Learning May-8-2026

We introduce the metagame, a conceptual framework for quantifying second-order interaction effects of model explanations. For any first-order attribution $\phi(f)$ explaining a model $f$, we measure the directional influence of feature $j$ on the attribution of feature $i$, denoted as meta-attribution $\phi_{j \to i}(f)$, by treating the attribution method itself as a cooperative game and computing its Shapley value. Theoretically, we prove that attributions hierarchically decompose into meta-attributions, and establish these as directional extensions of existing interaction indices. Empirically, we demonstrate that the metagame delivers insights across diverse interpretability applications: (i) quantifying token interactions in instruction-tuned language models, (ii) explaining cross-modal similarity in vision-language encoders, and (iii) interpreting text-to-image concepts in multimodal diffusion transformers.
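The idea of applying a Shapley value to an attribution can be illustrated on a toy model. The sketch below is not the authors' implementation: the abstract does not give the exact construction, so it assumes that the game for feature i assigns to each coalition T the Shapley attribution of i when only the features in T are available and all others are held at a baseline. The names `shapley`, `v`, and `meta_attribution`, the toy `model`, and the zero baseline are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley(value, players):
    """Exact Shapley values of a cooperative game given as value(frozenset)."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for r in range(n):
            # Weight of a coalition of size r that does not contain player i.
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in combinations(others, r):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


# Hypothetical toy model: features 0 and 1 interact, feature 2 is additive.
def model(x):
    return x[0] * x[1] + x[2]


X = (1.0, 2.0, 3.0)
BASELINE = (0.0, 0.0, 0.0)  # assumed "feature absent" value
PLAYERS = (0, 1, 2)


def v(coalition):
    """First-order game: model output with absent features set to the baseline."""
    masked = [X[k] if k in coalition else BASELINE[k] for k in PLAYERS]
    return model(masked)


def meta_attribution(i):
    """Assumed metagame for feature i: Shapley values of T -> phi_i restricted to T."""
    def w(available):
        return shapley(lambda s: v(s & available), PLAYERS)[i]
    return shapley(w, PLAYERS)


first_order = shapley(v, PLAYERS)
meta_for_0 = meta_attribution(0)
```

Under these assumptions, `meta_for_0[1]` is the influence of feature 1 on the attribution of feature 0, and the meta-attributions for feature 0 sum to its first-order attribution, which mirrors the hierarchical decomposition the abstract describes. The enumeration is exponential in the number of features, so this is only practical for very small games.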

large language model, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2605.06295

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Leisure & Entertainment (0.93)
Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
(2 more...)

Explanation of Dynamic Physical Field Predictions using WassersteinGrad: Application to Autoregressive Weather Forecasting

Essafouri, Younes, Raynaud, Laure, Drozda, Luciano, Risser, Laurent

arXiv.org Machine Learning Apr-27-2026

As the demand to integrate Artificial Intelligence into high-stakes environments continues to grow, explaining the reasoning behind neural-network predictions has shifted from a theoretical curiosity to a strict operational requirement. Our work is motivated by the need to explain autoregressive neural predictions on dynamic physical fields, as in weather forecasting. Gradient-based feature attribution methods are widely used to explain predictions on such data, in particular because they scale to high-dimensional inputs. Gradient-based techniques such as SmoothGrad are now standard on images, where they make explanations more robust by pointwise averaging the attribution maps obtained from several noised inputs. Our goal is to efficiently adapt this aggregation strategy to dynamic physical fields. To do so, our first contribution is to identify a fundamental failure mode when averaging perturbed attribution maps on dynamic physical fields: stochastic input perturbations do not induce stationary amplitude noise in attribution maps, but instead cause a geometric displacement of the attributions. Consequently, pointwise averaging blurs these spatially misaligned features. To tackle this issue, we introduce WassersteinGrad, which extracts a geometric consensus of perturbed attribution maps by computing their entropic Wasserstein barycenter. The results, obtained on regional weather data and a meteorologist-validated neural model, demonstrate promising explainability properties of WassersteinGrad over gradient-based baselines across both single-step and autoregressive forecasting settings.
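The failure mode described above can be reproduced in one dimension. The sketch below is an illustration under strong assumptions, not the WassersteinGrad algorithm: it uses three hand-made, spatially displaced bumps as stand-ins for perturbed attribution maps, and it replaces the entropic Wasserstein barycenter with the unregularized 1D barycenter, which can be obtained by averaging quantile functions. All names (`bump`, `spread`, `barycenter_1d`) and the grid size are hypothetical.

```python
import math

GRID = 101  # hypothetical 1D spatial grid


def bump(center, sigma=3.0):
    """Normalized Gaussian bump standing in for one perturbed attribution map."""
    weights = [math.exp(-0.5 * ((k - center) / sigma) ** 2) for k in range(GRID)]
    total = sum(weights)
    return [w / total for w in weights]


def center_of_mass(hist):
    return sum(k * h for k, h in enumerate(hist))


def spread(hist):
    """Standard deviation of the mass distribution over the grid."""
    mean = center_of_mass(hist)
    return math.sqrt(sum(h * (k - mean) ** 2 for k, h in enumerate(hist)))


def quantile(hist, q):
    """Smallest grid index whose cumulative mass reaches q."""
    running = 0.0
    for k, h in enumerate(hist):
        running += h
        if running >= q:
            return k
    return len(hist) - 1


def pointwise_mean(maps):
    """SmoothGrad-style aggregation: average the maps cell by cell."""
    return [sum(column) / len(maps) for column in zip(*maps)]


def barycenter_1d(maps, levels=1000):
    """Unregularized 1D Wasserstein barycenter via averaged quantile functions."""
    hist = [0.0] * GRID
    for m in range(levels):
        q = (m + 0.5) / levels
        position = sum(quantile(h, q) for h in maps) / len(maps)
        hist[round(position)] += 1.0 / levels
    return hist


# Three copies of the same feature, displaced rather than amplitude-perturbed.
maps = [bump(30), bump(50), bump(70)]
blurred = pointwise_mean(maps)
consensus = barycenter_1d(maps)
```

In this toy setting `spread(blurred)` is several times larger than the spread of any single map, while `spread(consensus)` stays close to it and `center_of_mass(consensus)` sits near the middle displacement. Real weather fields are at least two-dimensional, where no such closed form exists, which is presumably why an entropic solver is used in the paper.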

artificial intelligence, displacement, machine learning, (19 more...)

arXiv.org Machine Learning

2604.2258

Country: Europe > France (0.15)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

1305_making_sense_of_dependence_eff

Paul Novello

Neural Information Processing Systems Apr-24-2026, 22:45:48 GMT

In this part, we state the orthogonal decomposition property, motivate its importance with a pedagogical example, and finally prove Proposition 1, which enables the decomposition property in the context of the HSIC attribution method. A.1 Orthogonal Decomposition Property Let $x = \{x_1, \dots, x_n\} \in \mathcal{X}^n$ be a set of $n$ univariate random input variables. For any subset $A = \{l_1, \dots, l_{|A|}\} \subseteq \{1, \dots, n\}$, we denote by $x_A = (x_{l_1}, \dots, x_{l_{|A|}})$ the vector of input variables with indices in $A$. Let $y$ be the random output variable defined by $y = f(x)$, $\mathcal{F}$ the RKHS defined by the kernel $k_A : \mathcal{X}^{|A|} \to \mathbb{R}$ and $\mathcal{G}$ the RKHS defined by the kernel $l : \mathcal{Y} \to \mathbb{R}$. In [11], the author shows that for any choice of kernel $l$, if some constraints on the kernel $k_A$ are respected, we can construct indices $\mathrm{HSIC}(x_A, y)$ that satisfy the following decomposition property. The constraints on the kernel $k_A$ are detailed in the main document and in the last section of this appendix.

artificial intelligence, hsic, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Making Sense of Dependence: Efficient Black-box Explanations Using Dependence Measure

Neural Information Processing Systems Apr-24-2026, 22:45:43 GMT

This paper presents a new efficient black-box attribution method built on the Hilbert-Schmidt Independence Criterion (HSIC). Based on Reproducing Kernel Hilbert Spaces (RKHS), HSIC measures the dependence between regions of an input image and the output of a model using the kernel embedding of their distributions. It thus provides explanations enriched by RKHS representation capabilities. HSIC can be estimated very efficiently, significantly reducing the computational cost compared to other black-box attribution methods. Our experiments show that HSIC is up to 8 times faster than the previous best black-box attribution methods while being as faithful. Indeed, we improve or match the state of the art of both black-box and white-box attribution methods for several fidelity metrics on ImageNet with various recent model architectures. Importantly, we show that these advances can be transposed to efficiently and faithfully explain object detection models such as YOLOv4. Finally, we extend the traditional attribution methods by proposing a new kernel enabling an ANOVA-like orthogonal decomposition of importance scores based on HSIC, allowing us to evaluate not only the importance of each image patch but also the importance of their pairwise interactions.
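To make the dependence measure concrete, here is a minimal sketch of the standard biased empirical HSIC estimator, trace(K H L H) / (n - 1)^2, for scalar samples. It is not the paper's attribution pipeline: the RBF kernels, the bandwidths, and the scalar toy data are assumptions chosen for illustration, whereas the method described above applies HSIC to image regions and model outputs with its own kernel choices.

```python
import math


def rbf_gram(values, bandwidth):
    """Gram matrix of a Gaussian (RBF) kernel on scalar samples."""
    return [
        [math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
        for a in values
    ]


def center(gram):
    """Double-center a symmetric Gram matrix, i.e. compute H K H."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, bandwidth_x=0.5, bandwidth_y=0.5):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(xs)
    k_centered = center(rbf_gram(xs, bandwidth_x))
    l_gram = rbf_gram(ys, bandwidth_y)
    # trace(K H L H) equals trace((H K H) L) by cyclicity of the trace.
    trace = sum(k_centered[i][j] * l_gram[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


# Hypothetical data: the output depends nonlinearly on the input, or not at all.
xs = [-1.0 + 2.0 * i / 39 for i in range(40)]
dependent = hsic(xs, [x ** 2 for x in xs])
independent = hsic(xs, [0.3 for _ in xs])
```

A constant output gives an HSIC of zero up to rounding, while the quadratic relationship gives a clearly positive value even though the linear correlation between `x` and `x ** 2` is zero on this symmetric grid. That sensitivity to nonlinear dependence is what makes kernel dependence measures attractive for attribution.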

attribution method, data mining, machine learning, (21 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report (0.67)

Industry: Transportation > Air (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Vision (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

0fe6a94848e5c68a54010b61b3e94b0e-Supplemental.pdf

Neural Information Processing Systems Apr-24-2026, 17:51:36 GMT

Post-hoc gradient-based interpretability methods [1, 2] that provide instance-specific explanations of model predictions are often based on assumption (A): the magnitude of input gradients (gradients of logits with respect to the input) noisily highlights discriminative task-relevant features. In this work, we test the validity of assumption (A) using a three-pronged approach: 1. We develop an evaluation framework, DiffROAR, to test assumption (A) on four image classification benchmarks. Our results suggest that (i) input gradients of standard models (i.e., trained on original data) may grossly violate (A), whereas (ii) input gradients of adversarially robust models satisfy (A) reasonably well.
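Assumption (A) is about ranking input coordinates by gradient magnitude, which can be sketched on a toy function. The example below is not DiffROAR, whose procedure is not spelled out in this excerpt; it only shows the two ingredients such an evaluation would need, namely an input-gradient ranking and masking of the top-ranked versus bottom-ranked coordinates. The toy `logit`, the finite-difference gradient, and the zero fill value are assumptions.

```python
def logit(x):
    """Hypothetical toy logit over four input features (feature 3 is unused)."""
    return 3.0 * x[0] - 0.1 * x[1] + x[2] ** 2


def input_gradient(f, x, eps=1e-5):
    """Central finite-difference estimate of the gradient of f at x."""
    grad = []
    for k in range(len(x)):
        up = list(x)
        down = list(x)
        up[k] += eps
        down[k] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad


def rank_by_magnitude(grad):
    """Feature indices sorted from largest to smallest absolute gradient."""
    return sorted(range(len(grad)), key=lambda k: abs(grad[k]), reverse=True)


def mask(x, indices, fill=0.0):
    """Replace the selected coordinates with a fill value."""
    return [fill if k in indices else value for k, value in enumerate(x)]


x = [1.0, 1.0, 2.0, 0.5]
grad = input_gradient(logit, x)
order = rank_by_magnitude(grad)
top2_removed = mask(x, set(order[:2]))
bottom2_removed = mask(x, set(order[-2:]))
```

Here removing the two top-ranked coordinates changes the logit far more than removing the two bottom-ranked ones, which is the behavior assumption (A) predicts. The abstract's point is that this is not guaranteed for standard trained networks, so it has to be tested rather than assumed.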

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > Italy (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

0fe6a94848e5c68a54010b61b3e94b0e-Paper.pdf

Neural Information Processing Systems Apr-24-2026, 17:51:32 GMT

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > Italy (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

experiments

Neural Information Processing Systems Apr-24-2026, 16:48:45 GMT

A.1 Experimental design Figure 1 summarizes the experimental design used for our experiments. The participants in our experiments are users of the online platform Amazon Mechanical Turk (AMT). On this platform, users stay anonymous; hence, we do not collect any sensitive personal information about them. We prioritized users with a Master qualification (a qualification that AMT grants to users who have proven to be of excellent quality) or normal users with high qualifications (number of HIT completed = 10000 and HIT accepted > 98%). Before starting the experiment, participants are asked to read and agree to a consent form, which specifies the objective and procedure of the experiment, the expected time to completion (5-8 min) with the associated reward ($1.4), and the risks, benefits, and confidentiality of taking part in this study.

artificial intelligence, experiment, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.68)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.69)
Information Technology > Communications > Social Media > Crowdsourcing (0.34)

13113e938f2957891c0c5e8df811dd01-Paper-Conference.pdf

Neural Information Processing Systems Apr-24-2026, 16:48:41 GMT

explanation, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Europe > France (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.93)
Government > Regional Government (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

M4: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models

Neural Information Processing Systems Apr-24-2026, 08:14:08 GMT

While Explainable Artificial Intelligence (XAI) techniques have been widely studied to explain predictions made by deep neural networks, evaluating the faithfulness of explanation results remains challenging, due to the heterogeneity of explanations for various models and the lack of ground-truth explanations. This paper introduces an XAI benchmark named M4, which allows evaluating various input feature attribution methods using the same set of faithfulness metrics across multiple data modalities (images and texts) and network structures (ResNets, MobileNets, Transformers). We also propose a taxonomy for the metrics: we first categorize commonly used XAI evaluation metrics into three groups based on the ground truth they require. We then implement classic and state-of-the-art feature attribution methods using InterpretDL and conduct extensive experiments to compare methods and gain insights. These experiments also provide holistic evaluations that serve as benchmark baselines. Several interesting observations are made for designing attribution algorithms.
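As one illustration of what a faithfulness metric can look like when no ground-truth explanation exists, the sketch below computes a perturbation-based score: features are deleted most-relevant-first and least-relevant-first, and the mean gap between the two output curves is reported. The abstract does not list the metrics in M4, so treating this deletion-curve gap as representative is an assumption, and the linear `model`, the zero baseline, and all names are hypothetical.

```python
WEIGHTS = [4.0, 1.0, 2.0, 0.5]  # hypothetical linear model


def model(x):
    return sum(w * value for w, value in zip(WEIGHTS, x))


def deletion_curve(x, order, baseline=0.0):
    """Model output after deleting features one at a time in the given order."""
    current = list(x)
    curve = [model(current)]
    for k in order:
        current[k] = baseline
        curve.append(model(current))
    return curve


def curve_gap(x, attribution):
    """Mean gap between least-relevant-first and most-relevant-first curves."""
    most_first = sorted(range(len(x)), key=lambda k: attribution[k], reverse=True)
    least_first = most_first[::-1]
    most_curve = deletion_curve(x, most_first)
    least_curve = deletion_curve(x, least_first)
    gaps = [low - high for low, high in zip(least_curve, most_curve)]
    return sum(gaps) / len(gaps)


x = [1.0, 1.0, 1.0, 1.0]
faithful = [w * value for w, value in zip(WEIGHTS, x)]  # weight times input
misleading = [-a for a in faithful]  # deliberately reversed ranking

faithful_score = curve_gap(x, faithful)
misleading_score = curve_gap(x, misleading)
```

A faithful attribution yields a positive gap because deleting its top-ranked features makes the output drop fastest, while the reversed ranking yields a negative gap of the same size. The score depends on the chosen baseline value, which is one reason a benchmark that fixes such choices across methods is useful.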

attribution method, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
