 non-robust feature


Adversarial Examples Are Not Real Features

Neural Information Processing Systems

The existence of adversarial examples has been a mystery for years and has attracted much interest. A well-known theory by Ilyas et al. (2019) explains adversarial vulnerability from a data perspective, showing that one can extract non-robust features from adversarial examples and that these features alone are useful for classification. However, the explanation remains counter-intuitive, since non-robust features mostly appear as noise to humans. In this paper, we re-examine the theory in a larger context by incorporating multiple learning paradigms. Notably, we find that, contrary to their good usefulness under supervised learning, non-robust features attain poor usefulness when transferred to other self-supervised learning paradigms, such as contrastive learning, masked image modeling, and diffusion models. This reveals that non-robust features are not really as useful as robust or natural features, which enjoy good transferability between these paradigms. Regarding robustness, we also show that naturally trained encoders built from robust features are largely non-robust under AutoAttack. Our cross-paradigm examination suggests that non-robust features are not really useful but act more like paradigm-wise shortcuts, and that robust features alone might be insufficient to attain reliable model robustness.
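For concreteness, a minimal sketch of the kind of robustness check mentioned at the end of the abstract: freeze a naturally trained encoder, attach a linear head, and attack the composite classifier. The paper evaluates with AutoAttack; the sketch below substitutes a plain untargeted L-infinity PGD loop, and `encoder`, `linear_head`, and the attack hyperparameters are illustrative placeholders rather than the authors' setup.

```python
# Hedged sketch: untargeted L-inf PGD robustness check for an encoder + linear probe.
# The paper reports AutoAttack numbers; plain PGD is used here only for illustration.
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8 / 255, alpha=2 / 255, steps=20):
    """Return adversarial inputs within an L-inf ball of radius eps around x."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()        # ascend the loss
            delta.clamp_(-eps, eps)                   # stay inside the L-inf ball
            delta.copy_((x + delta).clamp(0, 1) - x)  # stay in valid pixel range
        delta.grad.zero_()
    return (x + delta).detach()

@torch.no_grad()
def robust_accuracy(model, x_adv, y):
    return (model(x_adv).argmax(dim=1) == y).float().mean().item()

# usage (placeholders): model = torch.nn.Sequential(encoder, linear_head).eval()
# x_adv = pgd_linf(model, images, labels); print(robust_accuracy(model, x_adv, labels))
```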


Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck

Neural Information Processing Systems

Adversarial examples, generated by carefully crafted perturbations, have attracted considerable attention in research fields. Recent works have argued that the existence of robust and non-robust features is a primary cause of adversarial examples, and have investigated their internal interactions in the feature space. In this paper, we propose a way of explicitly distilling feature representations into robust and non-robust features using the Information Bottleneck. Specifically, we inject noise variation into each feature unit and evaluate the information flow in the feature representation, dichotomizing feature units as either robust or non-robust based on the noise variation magnitude. Through comprehensive experiments, we demonstrate that the distilled features are highly correlated with adversarial prediction and carry human-perceptible semantic information by themselves. Furthermore, we present an attack mechanism that intensifies the gradient of non-robust features directly related to the model prediction, and we validate its effectiveness in breaking model robustness.
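A rough, hedged sketch of the noise-injection idea (not the authors' exact Information Bottleneck objective): learn a per-unit noise scale on an intermediate feature map so that the prediction is preserved while as much noise as possible is injected, then split units by how much noise they tolerate. `feat_extractor`, `classifier_head`, and `beta` are illustrative placeholders.

```python
# Hedged sketch of per-unit noise injection as a proxy for an IB-style analysis.
# Units that tolerate little noise are treated as carrying prediction-relevant signal.
import torch
import torch.nn.functional as F

def fit_noise_scales(feat_extractor, classifier_head, x, y,
                     beta=10.0, steps=200, lr=0.05):
    """Fit a per-unit noise std for the intermediate features of one batch."""
    with torch.no_grad():
        h = feat_extractor(x)                          # (B, C, H, W) or (B, D)
    log_sigma = torch.zeros(h.shape[1:], device=h.device, requires_grad=True)
    opt = torch.optim.Adam([log_sigma], lr=lr)
    for _ in range(steps):
        sigma = log_sigma.exp()
        h_noisy = h + sigma * torch.randn_like(h)      # inject unit-wise Gaussian noise
        # keep the prediction while rewarding large noise scales (information penalty proxy)
        loss = F.cross_entropy(classifier_head(h_noisy), y) - beta * log_sigma.mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return log_sigma.exp().detach()

def split_units(sigma, quantile=0.5):
    """Dichotomize feature units by how much noise they tolerate."""
    thresh = sigma.flatten().quantile(quantile)
    sensitive = sigma <= thresh     # low tolerated noise: prediction-relevant units
    tolerant = sigma > thresh       # high tolerated noise: largely ignorable units
    return sensitive, tolerant
```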


What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?

Neural Information Processing Systems

The adversarial vulnerability of neural nets, and subsequent techniques to create robust models, have attracted significant attention; yet we still lack a full understanding of this phenomenon. Here, we study adversarial examples of trained neural networks through analytical tools afforded by recent theoretical advances connecting neural networks and kernel methods, namely the Neural Tangent Kernel (NTK), following a growing body of work that leverages the NTK approximation to successfully analyze important deep learning phenomena and design algorithms for new applications. We show how NTKs allow us to generate adversarial examples in a ``lazy'' regime. We leverage this connection to provide an alternative view on robust and non-robust features, which have been suggested to underlie the adversarial brittleness of neural nets. Specifically, we define and study features induced by the eigendecomposition of the kernel to better understand the role of robust and non-robust features, the reliance on both for standard classification, and the robustness-accuracy trade-off. We find that such features are surprisingly consistent across architectures, and that robust features tend to correspond to the largest eigenvalues of the model and are thus learned early during training. Our framework allows us to identify and visualize non-robust yet useful features. Finally, we shed light on the robustness mechanism underlying adversarial training of neural nets used in practice: quantifying the evolution of the associated empirical NTK, we demonstrate that its dynamics falls much earlier into the ``lazy'' regime and manifests a much stronger form of the well-known bias to prioritize learning features within the top eigenspaces of the kernel, compared to standard training.
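As a rough illustration of the kernel-feature construction (a simplified sketch, not the paper's exact definition), one can compute an empirical NTK on a small batch from per-example parameter gradients of a single logit and eigendecompose it; `model`, the batch, and the single-logit restriction are assumptions made here for brevity.

```python
# Hedged sketch: empirical NTK on a small batch and its eigendecomposition.
import torch

def empirical_ntk(model, xs, class_idx=0):
    """K[i, j] = <grad_theta f_c(x_i), grad_theta f_c(x_j)> for a single logit c."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = []
    for x in xs:                                   # one example at a time
        out = model(x.unsqueeze(0))[0, class_idx]  # scalar output f_c(x)
        g = torch.autograd.grad(out, params)
        grads.append(torch.cat([gi.flatten() for gi in g]))
    G = torch.stack(grads)                         # (N, P) per-example gradient matrix
    return G @ G.T                                 # (N, N) empirical NTK

def kernel_features(model, xs, class_idx=0):
    """Eigendecompose the empirical NTK; columns of evecs play the role of kernel features."""
    K = empirical_ntk(model, xs, class_idx)
    evals, evecs = torch.linalg.eigh(K)            # ascending eigenvalues
    order = evals.argsort(descending=True)
    return evals[order], evecs[:, order]
```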


Adversarial Examples Are Not Bugs, They Are Features
Andrew Ilyas

Neural Information Processing Systems

Adversarial examples have attracted significant attention in machine learning, but the reasons for their existence and pervasiveness remain unclear. We demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features (derived from patterns in the data distribution) that are highly predictive, yet brittle and (thus) incomprehensible to humans. After capturing these features within a theoretical framework, we establish their widespread existence in standard datasets. Finally, we present a simple setting where we can rigorously tie the phenomena we observe in practice to a misalignment between the (human-specified) notion of robustness and the inherent geometry of the data.
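A hedged sketch of the non-robust dataset construction described above: each image is perturbed with a targeted attack toward a class t ≠ y and then relabeled as t, so that only t's non-robust features predict the new label. The L2 budget, step sizes, and random choice of targets below are illustrative rather than the authors' exact protocol; `model` and `loader` are placeholders for a pretrained classifier and an image data loader.

```python
# Hedged sketch of building a "non-robust dataset" by targeted perturbation + relabeling.
import torch
import torch.nn.functional as F

def targeted_pgd(model, x, target, eps=0.5, alpha=0.1, steps=40):
    """L2-bounded targeted PGD on an image batch (B, C, H, W): push x toward target."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            # normalized gradient step, descending the targeted loss
            step = alpha * grad / (grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)
            x_adv = x_adv - step
            delta = x_adv - x
            norm = delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
            delta = delta * torch.clamp(eps / (norm + 1e-12), max=1.0)  # project to L2 ball
            x_adv = torch.clamp(x + delta, 0.0, 1.0).detach()
    return x_adv

def build_nonrobust_dataset(model, loader, num_classes=10, device="cuda"):
    """Perturb each image toward a random wrong class and relabel it with that class."""
    model.eval()
    images, labels = [], []
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        target = (y + torch.randint(1, num_classes, y.shape, device=device)) % num_classes
        x_adv = targeted_pgd(model, x, target)
        images.append(x_adv.cpu())
        labels.append(target.cpu())
    return torch.cat(images), torch.cat(labels)
```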



Can Adversarial Training Be Manipulated By Non-Robust Features? (Supplementary Material)

Neural Information Processing Systems

Supplementary Material: Can Adversarial Training Be Manipulated By Non-Robust Features? In this part, we discuss several independent (or concurrent) works that are closely related to this work. They also conclude that conventional adversarial training will prevent a drop in accuracy measured on both clean and adversarial images. In contrast, we focus on a more realistic setting that does not require a larger attack budget. From this perspective, our work is complementary to theirs. This makes the threat of stability attacks more insidious than that of Fu et al. [19].



Reviewer 1

Neural Information Processing Systems

"straightforward" from simply looking at the equations, we maintain that the multi-layer extension is a significant However, note from Figure 5 (appendix) the pattern in which the layers are sequentially "added" by the We consider the direction of finding other optimizations for layer choice an important future work. From eqn 3, you are correct, it is possible for all layers to contribute differently. Intuitively, the most impactful layers are added first. The decoding for this layer notation is shown in Figure 5 (appendix). We will be sure to clarify these points in the final version.