Rademacher Complexity for Adversarially Robust Generalization
Dong Yin, Kannan Ramchandran, Peter Bartlett
In recent years, many modern machine learning models, in particular deep neural networks, have achieved success in tasks such as image classification [31, 25], speech recognition [23], machine translation [5], and game playing [45]. However, although these models achieve state-of-the-art performance on many standard benchmarks and competitions, it has been observed that they can be made to produce wrong predictions with high confidence by adversarially adding perturbations to their inputs (images, audio signals). Such adversarial inputs are often called adversarial examples. Typical methods for generating adversarial examples include adding small perturbations that are imperceptible to humans [48], changing the surrounding areas of the main objects in images [19], and even simple rotation and translation [16]. This phenomenon was first discovered by Szegedy et al. [48] in image classification problems, and similar phenomena have since been observed in other areas [13, 30]. Adversarial examples pose serious challenges in many security-critical applications, such as medical diagnosis and autonomous driving: the existence of these examples shows that many state-of-the-art machine learning models are unreliable in the presence of adversarial attacks. Since the discovery of adversarial examples, there has been a race between designing robust models that can defend against adversarial attacks and designing attack algorithms that can generate adversarial examples and fool machine learning models [22, 24, 11, 12].
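As a rough illustration of the perturbation-based attacks mentioned above, the sketch below applies the well-known fast gradient sign method to a toy classifier. The network, input data, and perturbation budget are placeholders chosen for illustration only; they are not part of this paper's setting or results.

```python
import torch
import torch.nn as nn

# Minimal sketch of the fast gradient sign method (FGSM) for crafting an
# l_infinity-bounded adversarial perturbation on a toy model with random data.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 784)   # a "clean" input (placeholder)
y = torch.tensor([3])    # its label (placeholder)
epsilon = 0.1            # l_infinity perturbation budget (illustrative value)

# Compute the gradient of the loss with respect to the input.
x_adv = x.clone().requires_grad_(True)
loss = loss_fn(model(x_adv), y)
loss.backward()

# One gradient-sign step increases the loss within the epsilon-ball around x.
with torch.no_grad():
    x_adv = x + epsilon * x_adv.grad.sign()
    x_adv = x_adv.clamp(0.0, 1.0)  # keep the perturbed input in a valid range

# Compare predictions on the clean and perturbed inputs.
print(model(x).argmax(dim=1), model(x_adv).argmax(dim=1))
```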
November 7, 2018