Adapting to Evolving Adversaries with Regularized Continual Robust Training

Sihui Dai, Christian Cianfarani, Arjun Bhagoji, Vikash Sehwag, Prateek Mittal

arXiv.org Artificial Intelligence 

Robust training methods typically defend against specific attack types, such as Lp attacks with fixed budgets, and rarely account for the fact that defenders may encounter new attacks over time. A natural solution is to adapt the defended model to new adversaries as they arise via fine-tuning, a method which we call continual robust training (CRT). However, when implemented naively, fine-tuning on new attacks degrades robustness on previous attacks. This raises the question: how can we improve the initial training and fine-tuning of the model to simultaneously achieve robustness against previous and new attacks? We present theoretical results which show that the gap in a model's robustness against different attacks is bounded by how far each attack perturbs a sample in the model's logit space, suggesting that regularizing with respect to this logit space distance can help maintain robustness against previous attacks. Extensive experiments on 3 datasets (CIFAR-10, CIFAR-100, and ImageNette) and over 100 attack combinations demonstrate that the proposed regularization improves robust accuracy with little overhead in training time. Our findings and open-source code lay the groundwork for the deployment of models robust to evolving attacks.
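
To make the abstract's idea concrete, below is a minimal sketch (not the authors' released code) of one plausible form of the proposed fine-tuning step: train on the newly encountered attack while penalizing how far the attack displaces each sample in the model's logit space. The exact regularizer, the attack interface `new_attack`, and the weight `lam` are assumptions for illustration; the abstract only states that the regularization acts on logit-space distance.

```python
import torch
import torch.nn.functional as F

def regularized_crt_step(model, optimizer, x, y, new_attack, lam=1.0):
    """One hypothetical fine-tuning step of continual robust training (CRT)
    with a logit-space distance regularizer.

    `new_attack(model, x, y)` is a placeholder that is assumed to return
    adversarial examples under the newly encountered threat model
    (e.g., a PGD variant).
    """
    model.train()
    x_adv = new_attack(model, x, y)    # perturb inputs under the new attack
    logits_adv = model(x_adv)          # logits of adversarially perturbed inputs
    logits_clean = model(x)            # logits of clean inputs

    # Standard robust training loss on the new attack.
    loss_adv = F.cross_entropy(logits_adv, y)

    # Regularizer (assumed form): squared L2 distance in logit space between
    # clean and perturbed inputs. Keeping this displacement small is what, per
    # the paper's bound, limits the gap in robustness across attacks.
    reg = (logits_adv - logits_clean).pow(2).sum(dim=1).mean()

    loss = loss_adv + lam * reg
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch, `lam` trades off accuracy on the new attack against retention of robustness to previously seen attacks; its value and the precise regularizer form would need to follow the paper and its open-source code rather than this illustration.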