Provable Robustness of Adversarial Training for Learning Halfspaces with Noise
Difan Zou, Spencer Frei, Quanquan Gu
Modern deep learning models are powerful but brittle: standard stochastic gradient descent (SGD) training of deep neural networks can lead to remarkable performance as measured by classification accuracy on the test set, but this performance degrades rapidly if the metric is instead adversarially robust accuracy. This brittleness is most apparent in image classification tasks (Szegedy et al., 2014; Goodfellow et al., 2015), where neural networks trained by gradient descent achieve state-of-the-art classification accuracy on a number of benchmark tasks, yet imperceptible (adversarial) perturbations of an image can force the network to misclassify nearly all of its inputs. To formalize this observation, let us define the robust error of a classifier.
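The excerpt ends before the definition itself is stated. As a rough illustration only, the sketch below assumes the commonly used notion: a labeled point (x, y) with y in {-1, +1} counts as a robust error for a halfspace sign(w·x) if some perturbation of l_p norm at most r flips or zeroes its margin. For a linear classifier this reduces to checking y·(w·x) - r·||w||_q <= 0, where q is the dual exponent of p. The function names `dual_norm` and `robust_error`, the treatment of zero margin as an error, and the toy data are assumptions made for this example, not taken from the paper.

```python
import math


def dual_norm(w, p):
    """Return the l_q norm of w, where 1/p + 1/q = 1 (assumes p >= 1)."""
    if p == math.inf:
        return sum(abs(wi) for wi in w)   # dual of l_inf is l_1
    if p == 1:
        return max(abs(wi) for wi in w)   # dual of l_1 is l_inf
    q = p / (p - 1)
    return sum(abs(wi) ** q for wi in w) ** (1 / q)


def robust_error(w, xs, ys, r, p=math.inf):
    """Empirical robust error of the halfspace sign(w . x).

    A point counts as an error if the worst-case perturbation of
    l_p norm at most r drives its margin y * (w . x) to zero or below.
    Counting a zero margin as an error is an assumption of this sketch.
    """
    worst_case_shift = r * dual_norm(w, p)
    errors = 0
    for x, y in zip(xs, ys):
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        if margin - worst_case_shift <= 0:
            errors += 1
    return errors / len(xs)


# Toy data (hypothetical): three points with margins 2.0, 0.5 and 1.0.
w = [1.0, 0.0]
xs = [[2.0, 0.0], [0.5, 3.0], [-1.0, 1.0]]
ys = [1, 1, -1]

standard = robust_error(w, xs, ys, r=0.0)    # no perturbation allowed
perturbed = robust_error(w, xs, ys, r=0.75)  # l_inf ball of radius 0.75
```

With r = 0 this is the ordinary classification error; increasing r can only increase the value, which mirrors the gap between standard and robust accuracy described above.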
Apr-19-2021
- Genre:
- Research Report (0.50)
- Technology: