AITopics | attribution

Collaborating Authors

attribution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

GRALIS: A Unified Canonical Framework for Linear Attribution Methods via Riesz Representation

Fanale, Raimondo

arXiv.org Machine LearningMay-20-2026

The main XAI attribution methods for deep neural networks -- GradCAM, SHAP, LIME, Integrated Gradients -- operate on separate theoretical foundations and are not formally comparable. We present GRALIS (Gradient-Riesz Averaged Locally-Integrated Shapley), a mathematical framework establishing a representation theory for attributions: every additive, linear, and continuous attribution functional on L^2(Q,mu) admits a unique canonical representation (Q, w, Delta), proved necessary by the Riesz Representation Theorem. This class encompasses SHAP, IG, LIME and linearized GradCAM, but excludes nonlinear functionals such as standard GradCAM or attention maps. Seven formal theorems provide simultaneous guarantees absent in any individual method: (T1) necessary canonical form; (T2) exact completeness; (T3) Monte Carlo convergence O(1/sqrt(m))+O(1/k); (T4) exact Shapley Interaction Values; (T5) Hoeffding ANOVA decomposition; (T6) Sobol sensitivity generalization; (T7) multi-scale extension (MS-GRALIS) with minimum-variance weights. An algebraic appendix justifies the GRALIS-SIV correspondence via the Mobius transform without circularity. GRALIS satisfies 13.5/14 axiomatic properties vs. 2.5-6/14 for individual methods, including completeness, sensitivity, locality, order-k interactions and optimal multi-scale aggregation simultaneously. Preliminary validation on BreaKHis (1,187 histology images, DenseNet-121) reports deletion faithfulness AUC +0.015 (malignant), 96% class-conditional consistency, SAL = 0.762+/-0.109 and sparsity index 0.39. Extended comparison with baseline XAI methods is planned for a companion paper.

artificial intelligence, gralis, machine learning, (20 more...)

arXiv.org Machine Learning

2605.0548

Country: Europe > Italy (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Training data attribution in diffusion models via mirrored unlearning and noise-consistent skew

Serrà, Joan, Goswami, Dipam, Morreale, Fabio, Liao, Wei-Hsiang, Mitsufuji, Yuki

arXiv.org Machine LearningMay-19-2026

Training data attribution (TDA) should enable generative model interpretability and foster a variety of related downstream tasks. Nonetheless, current TDA approaches lack reliability and robustness, preventing their adoption in real-world setups. In this paper, we take a decisive step towards more reliable and robust TDA for diffusion models. We propose to perform TDA with mirrored unlearning and noise-consistent skew (MUCS). The idea is to fine-tune a second model with bounded mirrored gradient ascent, and to measure the normalized skew of this model with respect to the original one using consistent noise samples. We show that, while being conceptually simple and generic, MUCS systematically outperforms existing methods on three different datasets by a large margin. We additionally study the effect that core design choices have on final performance, and analyze novel aspects regarding the overlap of influential instances across generated items and the potential of ensembling TDA approaches. We believe that our findings may have broader implications for more general unlearning setups, as well as for tasks requiring the comparison of diffusion losses.

artificial intelligence, diffusion model, machine learning, (14 more...)

arXiv.org Machine Learning

2605.17938

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.46)

Industry: Information Technology > Security & Privacy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Multiscale Euclidean Network Trajectories: Second-Moment Geometry, Attribution, and Change Points

Ezoe, Haruka, Hisano, Ryohei

arXiv.org Machine LearningMay-12-2026

A central challenge in dynamic network analysis is to represent temporal evolution in a way that is both geometrically meaningful and statistically identifiable. One approach embeds a sequence of network snapshots as trajectories in a Euclidean space and relates these trajectories to node embeddings. In multilayer and unfolded spectral constructions, however, node embeddings and their underlying latent positions are identifiable only up to general linear transformations. Although this ambiguity preserves edge probabilities, it can distort geometry and invalidate distance based temporal comparisons at both the trajectory and node-levels. We develop Multiscale Euclidean Network Trajectories (MENT), a framework for multiscale temporal trajectories based on second-moment geometry. By imposing an isotropic normalization on the anchor latent positions, we reduce the relevant ambiguity to orthogonal transformations and prevent distortion of the second-moment geometry. In this canonical representation, we define a trace variation distance and mode-wise variation distances along orthogonal directions, and use multidimensional scaling to obtain low-dimensional trajectories of time points at both global and mode-wise levels. The resulting trajectories support interpretation and inference. They admit mode-wise decompositions, support attribution of global and mode-wise temporal changes to nodes, and enable change point detection through 1D trajectories. We prove consistency of the proposed unfolded spectral embedding and of the induced temporal trajectories. Experiments on two synthetic and two real dynamic networks illustrate stable and interpretable recovery of temporal structure and show strong performance against existing change point detection baselines.

data mining, machine learning, trajectory, (20 more...)

arXiv.org Machine Learning

2605.04589

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.67)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy (1.00)
(3 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Attributions All the Way Down? The Metagame of Interpretability

Baniecki, Hubert, Biecek, Przemyslaw, Fumagalli, Fabian

arXiv.org Machine LearningMay-8-2026

We introduce the metagame, a conceptual framework for quantifying second-order interaction effects of model explanations. For any first-order attribution $ϕ(f)$ explaining a model $f$, we measure the directional influence of feature $j$ on the attribution of feature $i$, denoted as meta-attribution $φ_{j \to i}(f)$, by treating the attribution method itself as a cooperative game and computing its Shapley value. Theoretically, we prove that attributions hierarchically decompose into meta-attributions, and establish these as directional extensions of existing interaction indices. Empirically, we demonstrate that the metagame delivers insights across diverse interpretability applications: (i) quantifying token interactions in instruction-tuned language models, (ii) explaining cross-modal similarity in vision-language encoders, and (iii) interpreting text-to-image concepts in multimodal diffusion transformers.

large language model, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2605.06295

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Leisure & Entertainment (0.93)
Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
(2 more...)

Add feedback

Validating the Clinical Utility of CineECG 3D Reconstructions through Cross-Modal Feature Attribution

Dobiczek, Karol, Mozolewski, Maciej, Bobek, Szymon, Szafarczyk, Michał, van Dam, Peter, Nalepa, Grzegorz J.

arXiv.org Machine LearningMay-1-2026

Deep learning models for 12-lead electrocardiogram (ECG) analysis achieve high diagnostic performance but lack the intuitive interpretability required for clinical integration. Standard feature attribution methods are limited by the inherent difficulty in mapping abstract waveform fluctuations to physical anatomical pathologies. To resolve this, we propose a cross-modal method that projects feature attributions from high-performance 12-lead ECG models onto the CineECG 3D anatomical space. Our study reveals that while models trained directly on CineECG signals suffer from reduced accuracy and incoherent attributions, the proposed mapping mechanism effectively recovers clinically relevant feature rankings. Validated against a ground-truth dataset of 20 cases annotated by domain experts, the mapped explanations yield a Dice score of 0.56, significantly outperforming the 0.47 baseline of standard 12-lead attributions. These findings indicate that cross-modal averaging mapping effectively filters attribution instability and improves the localization of pathological features, combining the diagnostic expressiveness of standard ECG with the intuitive clarity of anatomical visualization.

artificial intelligence, attribution, machine learning, (15 more...)

arXiv.org Machine Learning

2604.27017

Country:

Europe > Poland (0.15)
North America > United States (0.14)
Asia > Thailand (0.14)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Interpreting Representation Quality of DNNs for 3DPoint Cloud Processing

Neural Information Processing SystemsApr-25-2026, 18:22:56 GMT

In this paper, we evaluate the quality of knowledge representations encoded in deep neural networks (DNNs) for 3D point cloud processing. We propose a method to disentangle the overall model vulnerability into the sensitivity to the rotation, the translation, the scale, and local 3D structures. Besides, we also propose metrics to evaluate the spatial smoothness of encoding 3D structures, and the representation complexity of the DNN. Based on such analysis, experiments expose representation problems with classic DNNs, and explain the utility of the adversarial training. The code will be released when this paper is accepted.

artificial intelligence, machine learning, sensitivity, (16 more...)

Neural Information Processing Systems

Country:

Asia (0.28)
North America > United States (0.28)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

0fe6a94848e5c68a54010b61b3e94b0e-Supplemental.pdf

Neural Information Processing SystemsApr-24-2026, 17:51:36 GMT

Post-hoc gradient-based interpretability methods [1, 2] that provide instancespecific explanations of model predictions are often based on assumption (A): magnitude of input gradients--gradients of logits with respect to input--noisily highlight discriminative task-relevant features. In this work, we test the validity of assumption (A) using a three-pronged approach: 1. We develop an evaluation framework, DiffROAR, to test assumption (A) on four image classification benchmarks. Our results suggest that (i) input gradients of standard models (i.e., trained on original data) may grossly violate (A), whereas (ii) input gradients of adversarially robust models satisfy (A) reasonably well.

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > Italy (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

0fe6a94848e5c68a54010b61b3e94b0e-Paper.pdf

Neural Information Processing SystemsApr-24-2026, 17:51:32 GMT

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > Italy (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Identifying interactions at scale for LLMs

AIHubApr-21-2026, 13:37:46 GMT

Understanding the behavior of complex machine learning systems, particularly Large Language Models (LLMs), is a critical challenge in modern artificial intelligence. Interpretability research aims to make the decision-making process more transparent to model builders and impacted humans, a step toward safer and more trustworthy AI. To achieve state-of-the-art performance, models synthesize complex feature relationships, find shared patterns from diverse training examples, and process information through highly interconnected internal components. In this blog post, we describe the fundamental ideas behind SPEX and ProxySPEX, algorithms capable of identifying these critical interactions at scale. We mask or remove specific segments of the input prompt and measure the resulting shift in the predictions.

large language model, machine learning, natural language, (18 more...)

AIHub

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Filters

Collaborating Authors

attribution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

GRALIS: A Unified Canonical Framework for Linear Attribution Methods via Riesz Representation

Training data attribution in diffusion models via mirrored unlearning and noise-consistent skew

Multiscale Euclidean Network Trajectories: Second-Moment Geometry, Attribution, and Change Points

Attributions All the Way Down? The Metagame of Interpretability

Validating the Clinical Utility of CineECG 3D Reconstructions through Cross-Modal Feature Attribution

Interpreting Representation Quality of DNNs for 3DPoint Cloud Processing

16e4be78e61a3897665fa01504e9f452-Paper-Conference.pdf

0fe6a94848e5c68a54010b61b3e94b0e-Supplemental.pdf

0fe6a94848e5c68a54010b61b3e94b0e-Paper.pdf

Identifying interactions at scale for LLMs