On the existence of consistent adversarial attacks in high-dimensional linear classification
Vilucchio, Matteo, Zdeborová, Lenka, Loureiro, Bruno
What fundamentally distinguishes an adversarial attack from a misclassification due to limited model expressivity or finite data? In this work, we investigate this question in the setting of high-dimensional binary classification, where statistical effects due to limited data availability play a central role. We introduce a new error metric that precisely captures this distinction, quantifying model vulnerability to consistent adversarial attacks, that is, perturbations that preserve the ground-truth labels. Our main technical contribution is an exact and rigorous asymptotic characterization of these metrics in both well-specified models and latent space models, revealing vulnerability patterns that differ from those of standard robust error measures. The theoretical results demonstrate that as models become more overparameterized, their vulnerability to label-preserving perturbations grows, offering theoretical insight into the mechanisms underlying model sensitivity to adversarial attacks.
Jun-17-2025
- Country:
  - North America
  - Europe
    - Austria (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
    - Switzerland > Vaud
      - Lausanne (0.04)
    - France > Île-de-France
- Genre:
  - Research Report > New Finding (0.87)
- Industry:
  - Information Technology > Security & Privacy (1.00)
  - Government > Military (1.00)
- Technology: