AITopics | pgd-at

Collaborating Authors

pgd-at

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ExploringAdversarialRobustnessofDeepState SpaceModels

Neural Information Processing SystemsFeb-7-2026, 19:04:40 GMT

However, its effectiveness in improving the AR of SSMs remains unclear.

artificial intelligence, linear, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

537d9b6c927223c796cac288cced29df-AuthorFeedback.pdf

Neural Information Processing SystemsOct-2-2025, 22:51:40 GMT

artificial intelligence, cifar10, trade-off, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.49)

Add feedback

How Do Training Methods Influence the Utilization of Vision Models?

Gavrikov, Paul, Agnihotri, Shashank, Keuper, Margret, Keuper, Janis

arXiv.org Artificial IntelligenceOct-18-2024

Not all learnable parameters (e.g., weights) contribute equally to a neural network's decision function. In fact, entire layers' parameters can sometimes be reset to random values with little to no impact on the model's decisions. We revisit earlier studies that examined how architecture and task complexity influence this phenomenon and ask: is this phenomenon also affected by how we train the model? We conducted experimental evaluations on a diverse set of ImageNet-1k classification models to explore this, keeping the architecture and training data constant but varying the training pipeline. Our findings reveal that the training method strongly influences which layers become critical to the decision function for a given task. For example, improved training regimes and self-supervised training increase the importance of early layers while significantly under-utilizing deeper layers. In contrast, methods such as adversarial training display an opposite trend. Our preliminary results extend previous findings, offering a more nuanced understanding of the inner mechanics of neural networks. Code: https://github.com/paulgavrikov/layer_criticality

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.1447

Country: Europe > Germany > Saarland (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Rethinking Adversarial Training with Neural Tangent Kernel

Li, Guanlin, Qiu, Han, Guo, Shangwei, Li, Jiwei, Zhang, Tianwei

arXiv.org Artificial IntelligenceDec-4-2023

Adversarial training (AT) is an important and attractive topic in deep learning security, exhibiting mysteries and odd properties. Recent studies of neural network training dynamics based on Neural Tangent Kernel (NTK) make it possible to reacquaint AT and deeply analyze its properties. In this paper, we perform an in-depth investigation of AT process and properties with NTK, such as NTK evolution. We uncover three new findings that are missed in previous works. First, we disclose the impact of data normalization on AT and the importance of unbiased estimators in batch normalization layers. Second, we experimentally explore the kernel dynamics and propose more time-saving AT methods. Third, we study the spectrum feature inside the kernel to address the catastrophic overfitting problem. To the best of our knowledge, it is the first work leveraging the observations of kernel dynamics to improve existing AT methods.

accuracy, kernel, neural network, (16 more...)

arXiv.org Artificial Intelligence

2312.02236

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Chongqing Province > Chongqing (0.04)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Alleviating the Effect of Data Imbalance on Adversarial Training

Li, Guanlin, Xu, Guowen, Zhang, Tianwei

arXiv.org Artificial IntelligenceDec-4-2023

In this paper, we study adversarial training on datasets that obey the long-tailed distribution, which is practical but rarely explored in previous works. Compared with conventional adversarial training on balanced datasets, this process falls into the dilemma of generating uneven adversarial examples (AEs) and an unbalanced feature embedding space, causing the resulting model to exhibit low robustness and accuracy on tail data. To combat that, we theoretically analyze the lower bound of the robust risk to train a model on a long-tailed dataset to obtain the key challenges in addressing the aforementioned dilemmas. Based on it, we propose a new adversarial training framework -- Re-balancing Adversarial Training (REAT). This framework consists of two components: (1) a new training strategy inspired by the effective number to guide the model to generate more balanced and informative AEs; (2) a carefully constructed penalty function to force a satisfactory feature space. Evaluation results on different datasets and model structures prove that REAT can effectively enhance the model's robustness and preserve the model's clean accuracy. The code can be found in https://github.com/GuanlinLee/REAT.

adversarial training, dataset, tail class, (17 more...)

arXiv.org Artificial Intelligence

2307.10205

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

On robust overfitting: adversarial training induced distribution matters

Tian, Runzhi, Mao, Yongyi

arXiv.org Artificial IntelligenceNov-28-2023

Despite their outstanding performance, deep neural networks (DNNs) are known to be vulnerable to adversarial attacks where a carefully designed perturbation may cause the network to make a wrong prediction [1, 2]. Many methods have been proposed to improve the robustness of DNNs against adversarial perturbations [3, 4, 5], among which Projected Gradient Descend based Adversarial Training (PGD-AT) [3] is arguably the most effective [6, 7]. A recent work in [8] however revealed a surprising phenomenon in PGD-AT: after training, even though the robust error (i.e., error probability in the predicted label for adversarially perturbed instances) is nearly zero on the training set, it may remain very high on the testing set. For example, on the testing set of CIFAR10, the robust error of PGD-AT trained model can be as large as 44.19%. This significantly contrasts the standard training: on CIFAR10, when the standard error (i.e., the error probability in the predicted label for non-perturbed instances) is nearly zero on the training set, its value on the testing set is only about 4%.

adversarial training, generalization, pgd-at, (14 more...)

arXiv.org Artificial Intelligence

2311.16526

Country:

North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Enhance Diffusion to Improve Robust Generalization

Sun, Jianhui, Sinha, Sanchit, Zhang, Aidong

arXiv.org Artificial IntelligenceAug-17-2023

Deep neural networks are susceptible to human imperceptible adversarial perturbations. One of the strongest defense mechanisms is \emph{Adversarial Training} (AT). In this paper, we aim to address two predominant problems in AT. First, there is still little consensus on how to set hyperparameters with a performance guarantee for AT research, and customized settings impede a fair comparison between different model designs in AT research. Second, the robustly trained neural networks struggle to generalize well and suffer from tremendous overfitting. This paper focuses on the primary AT framework - Projected Gradient Descent Adversarial Training (PGD-AT). We approximate the dynamic of PGD-AT by a continuous-time Stochastic Differential Equation (SDE), and show that the diffusion term of this SDE determines the robust generalization. An immediate implication of this theoretical finding is that robust generalization is positively correlated with the ratio between learning rate and batch size. We further propose a novel approach, \emph{Diffusion Enhanced Adversarial Training} (DEAT), to manipulate the diffusion term to improve robust generalization with virtually no extra computational burden. We theoretically show that DEAT obtains a tighter generalization bound than PGD-AT. Our empirical investigation is extensive and firmly attests that DEAT universally outperforms PGD-AT by a significant margin.

artificial intelligence, generalization, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3580305.3599333

2306.02618

Country:

North America > United States > California > Los Angeles County > Long Beach (0.15)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > Virginia (0.05)
(14 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Sports (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Catastrophic overfitting can be induced with discriminative non-robust features

Ortiz-Jiménez, Guillermo, de Jorge, Pau, Sanyal, Amartya, Bibi, Adel, Dokania, Puneet K., Frossard, Pascal, Rogéz, Gregory, Torr, Philip H. S.

arXiv.org Artificial IntelligenceAug-15-2023

Adversarial training (AT) is the de facto method for building robust neural networks, but it can be computationally expensive. To mitigate this, fast single-step attacks can be used, but this may lead to catastrophic overfitting (CO). This phenomenon appears when networks gain non-trivial robustness during the first stages of AT, but then reach a breaking point where they become vulnerable in just a few iterations. The mechanisms that lead to this failure mode are still poorly understood. In this work, we study the onset of CO in single-step AT methods through controlled modifications of typical datasets of natural images. In particular, we show that CO can be induced at much smaller $\epsilon$ values than it was observed before just by injecting images with seemingly innocuous features. These features aid non-robust classification but are not enough to achieve robustness on their own. Through extensive experiments we analyze this novel phenomenon and discover that the presence of these easy features induces a learning shortcut that leads to CO. Our findings provide new insights into the mechanisms of CO and improve our understanding of the dynamics of AT. The code to reproduce our experiments can be found at https://github.com/gortizji/co_features.

accuracy, dataset, perturbation, (16 more...)

arXiv.org Artificial Intelligence

2206.08242

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Exploring Memorization in Adversarial Training

Dong, Yinpeng, Xu, Ke, Yang, Xiao, Pang, Tianyu, Deng, Zhijie, Su, Hang, Zhu, Jun

arXiv.org Machine LearningJun-3-2021

It is well known that deep learning models have a propensity for fitting the entire training set even with random labels, which requires memorization of every training sample. In this paper, we investigate the memorization effect in adversarial training (AT) for promoting a deeper understanding of capacity, convergence, generalization, and especially robust overfitting of adversarially trained classifiers. We first demonstrate that deep networks have sufficient capacity to memorize adversarial examples of training data with completely random labels, but not all AT algorithms can converge under the extreme circumstance. Our study of AT with random labels motivates further analyses on the convergence and generalization of AT. We find that some AT methods suffer from a gradient instability issue, and the recently suggested complexity measures cannot explain robust generalization by considering models trained on random labels. Furthermore, we identify a significant drawback of memorization in AT that it could result in robust overfitting. We then propose a new mitigation algorithm motivated by detailed memorization analyses. Extensive experiments on various datasets validate the effectiveness of the proposed method.

adversarial example, international conference, random label, (13 more...)

arXiv.org Machine Learning

2106.01606

Country: