Confidence-aware Training of Smoothed Classifiers for Certified Robustness

Jeong, Jongheon, Kim, Seojin, Shin, Jinwoo

Dec-20-2022–arXiv.org Artificial Intelligence

Any classifier can be "smoothed out" under Gaussian noise to build a new classifier that is provably robust to $\ell_2$-adversarial perturbations, viz., by averaging its predictions over the noise via randomized smoothing. For smoothed classifiers, the fundamental trade-off between accuracy and (adversarial) robustness is well documented in the literature: increasing the robustness of a classifier for one input can come at the expense of decreased accuracy for other inputs. In this paper, we propose a simple training method that leverages this trade-off to obtain robust smoothed classifiers, in particular through sample-wise control of robustness over the training samples. We make this control feasible by using "accuracy under Gaussian noise" as an easy-to-compute proxy for the adversarial robustness of an input. Specifically, we vary the training objective depending on this proxy, filtering out samples that are unlikely to benefit from the worst-case (adversarial) objective. Our experiments show that the proposed method, despite its simplicity, consistently improves certified robustness over state-of-the-art training methods. Somewhat surprisingly, we find that these improvements persist even for other notions of robustness, e.g., robustness to various types of common corruptions.
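
The two quantities the abstract relies on, the smoothed prediction and "accuracy under Gaussian noise", can be illustrated with a minimal Monte Carlo sketch. This is not the paper's implementation: the function names, the toy base classifier, the sample count, and the fixed seed are all assumptions made for illustration, and the sketch covers only prediction and the proxy, not the training method itself.

```python
import random
from collections import Counter


def noisy_votes(base_classifier, x, sigma, n_samples=1000, seed=0):
    """Count the base classifier's predictions on Gaussian-perturbed copies of x.

    Hypothetical helper: x is a list of floats, sigma is the noise standard
    deviation, and base_classifier maps a list of floats to a class label.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_samples):
        noisy_x = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy_x)] += 1
    return votes


def smoothed_predict(base_classifier, x, sigma, n_samples=1000, seed=0):
    """Monte Carlo estimate of the smoothed classifier: the most frequent label under noise."""
    votes = noisy_votes(base_classifier, x, sigma, n_samples, seed)
    return votes.most_common(1)[0][0]


def accuracy_under_noise(base_classifier, x, y, sigma, n_samples=1000, seed=0):
    """Fraction of noisy copies of x that the base classifier labels as y.

    This is the per-sample proxy: values near 1 suggest the input is
    classified stably under noise, values near or below 0.5 suggest it is not.
    """
    votes = noisy_votes(base_classifier, x, sigma, n_samples, seed)
    return votes[y] / n_samples


def toy_classifier(x):
    """Assumed toy base classifier: class 1 if the first coordinate is positive, else 0."""
    return 1 if x[0] > 0 else 0


if __name__ == "__main__":
    far_from_boundary = [2.0, 0.0]
    on_boundary = [0.0, 0.0]
    print(smoothed_predict(toy_classifier, far_from_boundary, sigma=0.5))
    print(accuracy_under_noise(toy_classifier, far_from_boundary, 1, sigma=0.5))
    print(accuracy_under_noise(toy_classifier, on_boundary, 1, sigma=0.5))
```

In this toy setting, an input far from the decision boundary keeps its label under almost all noise draws, while an input on the boundary is labeled correctly only about half the time. A per-sample score of this kind is what makes sample-wise control over the training objective cheap to compute.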

accuracy, artificial intelligence, machine learning

Country:
- North America > Canada
  - Ontario > Toronto (0.14)
- Asia
  - Middle East > Jordan (0.04)
  - South Korea > Daejeon
    - Daejeon (0.04)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Information Technology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (0.93)
  - Machine Learning > Neural Networks (0.46)
