AITopics | attribution score

Collaborating Authors

attribution score

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

RoSHAP: A Distributional Framework and Robust Metric for Stable Feature Attribution

Xiang, Lanxin, Shi, Liang, Ye, Youhui, Jiang, Boyu, Zhou, Dawei, Guo, Feng

arXiv.org Machine LearningMay-15-2026

Feature attribution analysis is critical for interpreting machine learning models and supporting reliable data-driven decisions. However, feature attribution measures often exhibit stochastic variation: different train--test splits, random seeds, or model-fitting procedures can produce substantially different attribution values and feature rankings. This paper proposes a framework for incorporating stochastic nature of feature attribution and a robust attribution metric, RoSHAP, for stable feature ranking based on the SHAP metric. The proposed framework models the distribution of feature attribution scores and estimates it through bootstrap resampling and kernel density estimation. We show that, under mild regularity conditions, the aggregated feature attribution score is asymptotically Gaussian, which greatly reduces the computational cost of distribution estimation. The RoSHAP summarizes the distribution of SHAP into a robust feature-ranking criterion that simultaneously rewards features that are active, strong, and stable. Through simulations and real-data experiments, the proposed framework and RoSHAP outperform standard single-run attribution measures in identifying signal features. In addition, models built using RoSHAP-selected features achieve predictive performance comparable to full-feature models while using substantially fewer predictors. The proposed RoSHAP approach improves the stability and interpretability of machine learning models, enabling reliable and consistent insights for analysis.

artificial intelligence, machine learning, selection performance comparison, (15 more...)

arXiv.org Machine Learning

2605.15154

Country: North America > United States (0.47)

Genre: Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

Add feedback

dattri: A Library for Efficient Data Attribution Junwei Deng 1* Ting-Wei Li

Neural Information Processing SystemsFeb-18-2026, 18:02:54 GMT

Data attribution methods aim to quantify the influence of individual training samples on the prediction of artificial intelligence (AI) models.

artificial intelligence, data attribution method, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Michigan (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Towards Neuron Attributions in Multimodal Large Language Models

Neural Information Processing SystemsFeb-18-2026, 09:40:26 GMT

As Large Language Models (LLMs) demonstrate impressive capabilities, demys-tifying their internal mechanisms becomes increasingly vital.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
Asia > Singapore (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

ContextCite: Attributing Model Generation to Context Benjamin Cohen-Wang

Neural Information Processing SystemsFeb-17-2026, 10:09:40 GMT

How do language models use information provided as context when generating a response?

attribution, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > United Kingdom > England > Leicestershire > Leicester (0.04)
(7 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Government (1.00)
Media (0.93)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Post Hoc Explanations of Language Models Can Improve Language Models Satyapriya Krishna

Neural Information Processing SystemsFeb-17-2026, 04:44:21 GMT

Our work makes one of the first attempts at highlighting the potential of post hoc explanations as valuable tools for enhancing the effectiveness of LLMs.

explanation, large language model, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.82)

Add feedback

Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution

Neural Information Processing SystemsFeb-9-2026, 22:43:03 GMT

Vision-language pretrained models have seen remarkable success, but their application to safety-critical settings is limited by their lack of interpretability.

attribution method, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Switzerland (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Nuclear Medicine (0.46)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

AD-DROP: Attribution-DrivenDropoutforRobust LanguageModelFine-Tuning

Neural Information Processing SystemsFeb-8-2026, 22:12:47 GMT

Pre-training large language models (PrLMs) on massive unlabeled corpora and fine-tuning them on downstream tasks has become a new paradigm [1-3]. Their success can be partly attributed to the self-attention mechanism [4], yet these self-attention networks are often redundant [5, 6] and tend to cause overfitting when fine-tuned on downstream tasks due to the mismatch between their overparameterization and the limited annotated data [7-13]. To address this issue, various regularization techniques such as data augmentation [14, 15], adversarial training [16, 17]), and dropout-based methods [11,13,18]have been developed.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

1487987e862c44b91a0296cf3866387e-Paper.pdf

Neural Information Processing SystemsFeb-7-2026, 13:54:56 GMT

attribution, motif, sequence, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Add feedback

What Triggers my Model? Contrastive Explanations Inform Gender Choices by Translation Models

Hackenbuchner, Janiça, Tezcan, Arda, Daems, Joke

arXiv.org Artificial IntelligenceDec-10-2025

Interpretability can be implemented as a means to understand decisions taken by (black box) models, such as machine translation (MT) or large language models (LLMs). Yet, research in this area has been limited in relation to a manifested problem in these models: gender bias. With this research, we aim to move away from simply measuring bias to exploring its origins. Working with gender-ambiguous natural source data, this study examines which context, in the form of input tokens in the source sentence, influences (or triggers) the translation model choice of a certain gender inflection in the target language. To analyse this, we use contrastive explanations and compute saliency attribution. We first address the challenge of a lacking scoring threshold and specifically examine different attribution levels of source words on the model gender decisions in the translation. We compare salient source words with human perceptions of gender and demonstrate a noticeable overlap between human perceptions and model attribution. Additionally, we provide a linguistic analysis of salient words. Our work showcases the relevance of understanding model translation decisions in terms of gender, how this compares to human decisions and that this information should be leveraged to mitigate gender bias.

artificial intelligence, computational linguistic, natural language, (13 more...)

arXiv.org Artificial Intelligence

2512.0844

Country: Europe > United Kingdom > England (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

MACIE: Multi-Agent Causal Intelligence Explainer for Collective Behavior Understanding

Weinberg, Abraham Itzhak

arXiv.org Artificial IntelligenceNov-21-2025

As Multi Agent Reinforcement Learning systems are used in safety critical applications. Understanding why agents make decisions and how they achieve collective behavior is crucial. Existing explainable AI methods struggle in multi agent settings. They fail to attribute collective outcomes to individuals, quantify emergent behaviors, or capture complex interactions. We present MACIE Multi Agent Causal Intelligence Explainer, a framework combining structural causal models, interventional counterfactuals, and Shapley values to provide comprehensive explanations. MACIE addresses three questions. First, each agent's causal contribution using interventional attribution scores. Second, system level emergent intelligence through synergy metrics separating collective effects from individual contributions. Third, actionable explanations using natural language narratives synthesizing causal insights. We evaluate MACIE across four MARL scenarios: cooperative, competitive, and mixed motive. Results show accurate outcome attribution, mean phi_i equals 5.07, standard deviation less than 0.05, detection of positive emergence in cooperative tasks, synergy index up to 0.461, and efficient computation, 0.79 seconds per dataset on CPU. MACIE uniquely combines causal rigor, emergence quantification, and multi agent support while remaining practical for real time use. This represents a step toward interpretable, trustworthy, and accountable multi agent AI.

agent, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.15716

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report > New Finding (1.00)

Industry: