AITopics | shortcut feature

Collaborating Authors

shortcut feature

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Shortcut Features as Top Eigenfunctions of NTK: ALinear Neural Network Case and More

Neural Information Processing SystemsJun-19-2026, 09:45:49 GMT

One of the chronic problems of deep-learning models is shortcut learning. In a case where the majority of training data are dominated by a certain feature, neural networks prefer to learn such a feature even if the feature is not generalizable outside the training set. Based on the framework of Neural Tangent Kernel (NTK), we analyzed the case of linear neural networks to derive some important properties of shortcut learning. We defined a "feature" of a neural network as an eigenfunction of NTK. Then, we found that shortcut features correspond to features with larger eigenvalues when the shortcuts stem from the imbalanced number of samples in the clustered distribution. We also showed that the features with larger eigenvalues still have a large influence on the neural network output even after training, due to data variances in the clusters. Such a preference for certain features remains even when a margin of a neural network output is controlled, which shows that the max-margin bias is not the only major reason for shortcut learning. These properties of linear neural networks are empirically extended for more complex neural networks as a two-layer fully-connected ReLU network and a ResNet-18.

artificial intelligence, machine learning, shortcut label, (16 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Preventing Shortcut Learning in Medical Image Analysis through Intermediate Layer Knowledge Distillation from Specialist Teachers

Boland, Christopher, Tsaftaris, Sotirios, Dahdouh, Sonia

arXiv.org Artificial IntelligenceNov-25-2025

Deep learning models are prone to learning shortcut solutions to problems using spuriously correlated yet irrelevant features of their training data. In high-risk applications such as medical image analysis, this phenomenon may prevent models from using clinically meaningful features when making predictions, potentially leading to poor robustness and harm to patients. We demonstrate that different types of shortcuts (those that are diffuse and spread throughout the image, as well as those that are localized to specific areas) manifest distinctly across network layers and can, therefore, be more effectively targeted through mitigation strategies that target the intermediate layers. We propose a novel knowledge distillation framework that leverages a teacher network fine-tuned on a small subset of task-relevant data to mitigate shortcut learning in a student network trained on a large dataset corrupted with a bias feature. Through extensive experiments on CheXpert, ISIC 2017, and SimBA datasets using various architectures (ResNet-18, AlexNet, DenseNet-121, and 3D CNNs), we demonstrate consistent improvements over traditional Empirical Risk Minimization, augmentation-based bias-mitigation, and group-based bias-mitigation approaches. In many cases, we achieve comparable performance with a baseline model trained on bias-free data, even on out-of-distribution test data. Our results demonstrate the practical applicability of our approach to real-world medical imaging scenarios where bias annotations are limited and shortcut features are difficult to identify a priori.

artificial intelligence, machine learning, training data, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.59275/j.melba.2025-8888

2511.17421

Country:

North America > United States (0.46)
Europe (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Gradient-based Model Shortcut Detection for Time Series Classification

Ibarra, Salomon, Cantu, Frida, Zhou, Kaixiong, Zhang, Li

arXiv.org Artificial IntelligenceOct-14-2025

Deep learning models have attracted lots of research attention in time series classification (TSC) task in the past two decades. Recently, deep neural networks (DNN) have surpassed classical distance-based methods and achieved state-of-the-art performance. Despite their promising performance, deep neural networks (DNNs) have been shown to rely on spurious correlations present in the training data, which can hinder generalization. For instance, a model might incorrectly associate the presence of grass with the label ``cat" if the training set have majority of cats lying in grassy backgrounds. However, the shortcut behavior of DNNs in time series remain under-explored. Most existing shortcut work are relying on external attributes such as gender, patients group, instead of focus on the internal bias behavior in time series models. In this paper, we take the first step to investigate and establish point-based shortcut learning behavior in deep learning time series classification. We further propose a simple detection method based on other class to detect shortcut occurs without relying on test data or clean training classes. We test our proposed method in UCR time series datasets.

artificial intelligence, machine learning, shortcut, (15 more...)

arXiv.org Artificial Intelligence

2510.10075

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (0.46)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Disentangling Bias by Modeling Intra- and Inter-modal Causal Attention for Multimodal Sentiment Analysis

Jiang, Menghua, Lin, Yuxia, Chen, Baoliang, Hu, Haifeng, Jiang, Yuncheng, Mai, Sijie

arXiv.org Artificial IntelligenceAug-8-2025

Multimodal sentiment analysis (MSA) aims to understand human emotions by integrating information from multiple modalities, such as text, audio, and visual data. However, existing methods often suffer from spurious correlations both within and across modalities, leading models to rely on statistical shortcuts rather than true causal relationships, thereby undermining generalization. To mitigate this issue, we propose a Multi-relational Multimodal Causal Intervention (MMCI) model, which leverages the backdoor adjustment from causal theory to address the confounding effects of such shortcuts. Specifically, we first model the multimodal inputs as a multi-relational graph to explicitly capture intra- and inter-modal dependencies. Then, we apply an attention mechanism to separately estimate and disentangle the causal features and shortcut features corresponding to these intra- and inter-modal relations. Finally, by applying the backdoor adjustment, we stratify the shortcut features and dynamically combine them with the causal features to encourage MMCI to produce stable predictions under distribution shifts. Extensive experiments on several standard MSA datasets and out-of-distribution (OOD) test sets demonstrate that our method effectively suppresses biases and improves performance.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2508.04999

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.72)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.72)
(3 more...)

Add feedback

Mitigating Shortcut Learning with InterpoLated Learning

Korakakis, Michalis, Vlachos, Andreas, Weller, Adrian

arXiv.org Artificial IntelligenceJul-9-2025

Empirical risk minimization (ERM) incentivizes models to exploit shortcuts, i.e., spurious correlations between input attributes and labels that are prevalent in the majority of the training data but unrelated to the task at hand. This reliance hinders generalization on minority examples, where such correlations do not hold. Existing shortcut mitigation approaches are model-specific, difficult to tune, computationally expensive, and fail to improve learned representations. To address these issues, we propose InterpoLated Learning (InterpoLL) which interpolates the representations of majority examples to include features from intra-class minority examples with shortcut-mitigating patterns. This weakens shortcut influence, enabling models to acquire features predictive across both minority and majority examples. Experimental results on multiple natural language understanding tasks demonstrate that InterpoLL improves minority generalization over both ERM and state-of-the-art shortcut mitigation methods, without compromising accuracy on majority examples. Notably, these gains persist across encoder, encoder-decoder, and decoder-only architectures, demonstrating the method's broad applicability.

computational linguistic, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2507.05527

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.67)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Navigate Beyond Shortcuts: Debiased Learning through the Lens of Neural Collapse

Wang, Yining, Sun, Junjie, Wang, Chenyue, Zhang, Mi, Yang, Min

arXiv.org Artificial IntelligenceMay-9-2024

Recent studies have noted an intriguing phenomenon termed Neural Collapse, that is, when the neural networks establish the right correlation between feature spaces and the training targets, their last-layer features, together with the classifier weights, will collapse into a stable and symmetric structure. In this paper, we extend the investigation of Neural Collapse to the biased datasets with imbalanced attributes. We observe that models will easily fall into the pitfall of shortcut learning and form a biased, non-collapsed feature space at the early period of training, which is hard to reverse and limits the generalization capability. To tackle the root cause of biased classification, we follow the recent inspiration of prime training, and propose an avoid-shortcut learning framework without additional training complexity. With well-designed shortcut primes based on Neural Collapse structure, the models are encouraged to skip the pursuit of simple shortcuts and naturally capture the intrinsic correlations. Experimental results demonstrate that our method induces better convergence properties during training, and achieves state-of-the-art generalization performance on both synthetic and real-world biased datasets.

correlation, dataset, neural collapse, (17 more...)

arXiv.org Artificial Intelligence

2405.05587

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Add feedback

Adaptive Shortcut Debiasing for Online Continual Learning

Kim, Doyoung, Park, Dongmin, Shin, Yooju, Bang, Jihwan, Song, Hwanjun, Lee, Jae-Gil

arXiv.org Artificial IntelligenceDec-14-2023

We propose a novel framework DropTop that suppresses the shortcut bias in online continual learning (OCL) while being adaptive to the varying degree of the shortcut bias incurred by continuously changing environment. By the observed high-attention property of the shortcut bias, highly-activated features are considered candidates for debiasing. More importantly, resolving the limitation of the online environment where prior knowledge and auxiliary data are not ready, two novel techniques -- feature map fusion and adaptive intensity shifting -- enable us to automatically determine the appropriate level and proportion of the candidate shortcut features to be dropped. Extensive experiments on five benchmark datasets demonstrate that, when combined with various OCL algorithms, DropTop increases the average accuracy by up to 10.4% and decreases the forgetting by up to 63.2%.

droptop, intensity, shortcut feature, (13 more...)

arXiv.org Artificial Intelligence

2312.08677

Country: Asia > South Korea > Daejeon > Daejeon (0.04)

Genre:

Instructional Material > Online (0.70)
Research Report > Experimental Study (0.46)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Don't blame Dataset Shift! Shortcut Learning due to Gradients and Cross Entropy

Puli, Aahlad, Zhang, Lily, Wald, Yoav, Ranganath, Rajesh

arXiv.org Artificial IntelligenceAug-24-2023

Common explanations for shortcut learning assume that the shortcut improves prediction under the training distribution but not in the test distribution. Thus, models trained via the typical gradient-based optimization of cross-entropy, which we call default-ERM, utilize the shortcut. However, even when the stable feature determines the label in the training distribution and the shortcut does not provide any additional information, like in perception tasks, default-ERM still exhibits shortcut learning. Why are such solutions preferred when the loss for default-ERM can be driven to zero using the stable feature alone? By studying a linear perception task, we show that default-ERM's preference for maximizing the margin leads to models that depend more on the shortcut than the stable feature, even without overparameterization. This insight suggests that default-ERM's implicit inductive bias towards max-margin is unsuitable for perception tasks. Instead, we develop an inductive bias toward uniform margins and show that this bias guarantees dependence only on the perfect stable feature in the linear perception task. We develop loss functions that encourage uniform-margin solutions, called margin control (MARG-CTRL). MARG-CTRL mitigates shortcut learning on a variety of vision and language tasks, showing that better inductive biases can remove the need for expensive two-stage shortcut-mitigating methods in perception tasks.

accuracy, shortcut, stable feature, (16 more...)

arXiv.org Artificial Intelligence

2308.12553

Country:

North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Benign Shortcut for Debiasing: Fair Visual Recognition via Intervention with Shortcut Features

Zhang, Yi, Sang, Jitao, Wang, Junyang, Jiang, Dongmei, Wang, Yaowei

arXiv.org Artificial IntelligenceAug-12-2023

Machine learning models often learn to make predictions that rely on sensitive social attributes like gender and race, which poses significant fairness risks, especially in societal applications, such as hiring, banking, and criminal justice. Existing work tackles this issue by minimizing the employed information about social attributes in models for debiasing. However, the high correlation between target task and these social attributes makes learning on the target task incompatible with debiasing. Given that model bias arises due to the learning of bias features (\emph{i.e}., gender) that help target task optimization, we explore the following research question: \emph{Can we leverage shortcut features to replace the role of bias feature in target task optimization for debiasing?} To this end, we propose \emph{Shortcut Debiasing}, to first transfer the target task's learning of bias attributes from bias features to shortcut features, and then employ causal intervention to eliminate shortcut features during inference. The key idea of \emph{Shortcut Debiasing} is to design controllable shortcut features to on one hand replace bias features in contributing to the target task during the training stage, and on the other hand be easily removed by intervention during the inference stage. This guarantees the learning of the target task does not hinder the elimination of bias features. We apply \emph{Shortcut Debiasing} to several benchmark datasets, and achieve significant improvements over the state-of-the-art debiasing methods in both accuracy and fairness.

artificial intelligence, machine learning, shortcut feature, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3581783.3612317

2308.08482

Country:

North America > Canada > Ontario > National Capital Region > Ottawa (0.05)
Asia > China > Beijing > Beijing (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.48)

Industry: Law (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Confidence-Based Model Selection: When to Take Shortcuts for Subpopulation Shifts

Chen, Annie S., Lee, Yoonho, Setlur, Amrith, Levine, Sergey, Finn, Chelsea

arXiv.org Artificial IntelligenceJun-19-2023

Effective machine learning models learn both robust features that directly determine the outcome of interest (e.g., an object with wheels is more likely to be a car), and shortcut features (e.g., an object on a road is more likely to be a car). The latter can be a source of error under distributional shift, when the correlations change at test-time. The prevailing sentiment in the robustness literature is to avoid such correlative shortcut features and learn robust predictors. However, while robust predictors perform better on worst-case distributional shifts, they often sacrifice accuracy on majority subpopulations. In this paper, we argue that shortcut features should not be entirely discarded. Instead, if we can identify the subpopulation to which an input belongs, we can adaptively choose among models with different strengths to achieve high performance on both majority and minority subpopulations. We propose COnfidence-baSed MOdel Selection (CosMoS), where we observe that model confidence can effectively guide model selection. Notably, CosMoS does not require any target labels or group annotations, either of which may be difficult to obtain or unavailable. We evaluate CosMoS on four datasets with spurious correlations, each with multiple test sets with varying levels of data distribution shift. We find that CosMoS achieves 2-5% lower average regret across all subpopulations, compared to using only robust predictors or other model aggregation methods.

artificial intelligence, classifier, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2306.1112

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback