Learning to Defense by Learning to Attack
Zhehui Chen, Haoming Jiang, Bo Dai, Tuo Zhao
This decade has witnessed great breakthroughs in deep learning across a variety of applications, such as computer vision (Taigman et al., 2014; Girshick et al., 2014; He et al., 2016; Liu et al., 2017). Recent studies (Szegedy et al., 2013), however, show that most of these deep learning models are vulnerable to adversarial attacks. Specifically, by injecting a small perturbation into a normal sample, attackers obtain adversarial examples. Although these adversarial examples are semantically indistinguishable from the normal ones, they can severely fool deep learning models and undermine the security of deep learning, causing reliability problems in autonomous driving, biometric authentication, etc. Researchers have devoted considerable effort to studying efficient adversarial attacks and defenses (Szegedy et al., 2013; Goodfellow et al., 2014b; Nguyen et al., 2015; Zheng et al., 2016; Madry et al., 2017). There is a growing body of work on generating successful adversarial examples, e.g., the fast gradient sign method (FGSM, Goodfellow et al. (2014b)) and the projected gradient method (PGM, Kurakin et al. (2016)). As for robustness, Goodfellow et al. (2014b) first propose to robustify the model by adversarial training, i.e., augmenting the training data with adversarial examples.
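As a concrete illustration of the attacks mentioned above, below is a minimal PyTorch sketch (not the authors' implementation) of FGSM and an iterative projected-gradient attack. The function names and arguments (fgsm_attack, pgm_attack, model, loss_fn, epsilon, step_size, num_steps) are hypothetical placeholders for a classifier, its loss, the ell-infinity perturbation budget, the per-iteration step size, and the number of iterations.

import torch

def fgsm_attack(model, loss_fn, x, y, epsilon):
    # One-step FGSM: x_adv = x + epsilon * sign(grad_x loss(model(x), y)).
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    # torch.autograd.grad leaves the model's parameter gradients untouched.
    grad, = torch.autograd.grad(loss, x_adv)
    return (x_adv + epsilon * grad.sign()).detach()

def pgm_attack(model, loss_fn, x, y, epsilon, step_size, num_steps):
    # Iterative projected gradient attack: repeated FGSM-style steps, each
    # followed by projection onto the ell-infinity ball of radius epsilon
    # centered at the clean input x.
    x = x.clone().detach()
    x_adv = x.clone()
    for _ in range(num_steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv + step_size * grad.sign()
        # Project back onto the allowed perturbation set around x.
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)
    return x_adv.detach()

Both attacks move the input in the direction that increases the loss; the projected gradient attack simply repeats the FGSM step several times and projects back onto the allowed perturbation set after each step.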
November 3, 2018