Adversarial Training Should Be Cast as a Non-Zero-Sum Game

Robey, Alexander, Latorre, Fabian, Pappas, George J., Hassani, Hamed, Cevher, Volkan

Jun-19-2023, arXiv.org Artificial Intelligence

One prominent approach toward resolving the adversarial vulnerability of deep neural networks is the two-player zero-sum paradigm of adversarial training, in which predictors are trained against adversarially chosen perturbations of data. Despite the promise of this approach, algorithms based on this paradigm have not engendered sufficient levels of robustness and suffer from pathological behavior such as robust overfitting. To understand this shortcoming, we first show that the surrogate-based relaxation commonly used in adversarial training algorithms voids all guarantees on the robustness of trained classifiers. The identification of this pitfall informs a novel non-zero-sum bilevel formulation of adversarial training, wherein each player optimizes a different objective function. Our formulation naturally yields a simple algorithmic framework that matches and in some cases outperforms state-of-the-art attacks, attains levels of robustness comparable to standard adversarial training algorithms, and does not suffer from robust overfitting.
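The mismatch between a surrogate loss and the classification error can be seen in a small numerical sketch. The example below is illustrative only and is not taken from the paper: the logit vectors are made up, cross-entropy is assumed as the surrogate, and the helper names `cross_entropy` and `negative_margin` are hypothetical. It compares two candidate perturbed inputs, represented only by the logits a classifier might assign them, and shows that the one with the larger surrogate loss is still classified correctly while the one with the smaller surrogate loss is misclassified.

```python
import math

def cross_entropy(logits, label):
    """Surrogate loss: negative log-softmax probability of the true label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def negative_margin(logits, label):
    """Largest wrong-class logit minus the true-class logit.

    A positive value means the input is misclassified.
    """
    best_wrong = max(z for j, z in enumerate(logits) if j != label)
    return best_wrong - logits[label]

label = 0  # true class (assumed for illustration)

# Made-up logits for two hypothetical perturbations of the same input.
logits_a = [2.0, 1.9, 1.9]   # true class still has the largest logit
logits_b = [2.0, 2.1, -5.0]  # class 1 overtakes the true class

for name, logits in (("a", logits_a), ("b", logits_b)):
    print(
        name,
        "cross-entropy:", round(cross_entropy(logits, label), 3),
        "negative margin:", round(negative_margin(logits, label), 3),
    )
```

An attacker that maximizes the surrogate would prefer perturbation `a`, which does not change the prediction, over perturbation `b`, which does. One reading of "each player optimizes a different objective function" is that the attacker could instead target a margin-like quantity of this kind while the defender keeps minimizing a surrogate; that reading is an assumption here, since the abstract does not spell out the two objectives.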

artificial intelligence, machine learning, optimization problem

Country:
- North America
  - United States > Pennsylvania (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Czechia > Prague (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Switzerland > Zürich
    - Zürich (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (0.94)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks > Deep Learning (0.48)
