Generate (non-software) Bugs to Fool Classifiers
Yakura, Hiromu, Akimoto, Youhei, Sakuma, Jun
Let us consider a scenario in which an attacker wishes to modify an input image $x$ so that the target model $f$ classifies it with a specific label $t$. The generation process can be represented as follows:

$$\hat{v} = \operatorname*{argmin}_{v} \; L_f(x + v, t) + \epsilon \lVert v \rVert, \tag{1}$$

where $L_f$ denotes a loss function that represents how distant the input data are from the given label under $f$, and $\epsilon \lVert v \rVert$ is a norm term that regularizes the perturbation so that $v$ becomes unnoticeable to humans. Then, $x + \hat{v}$ is expected to form an adversarial example that is classified as $t$ while looking similar to $x$.

Earlier approaches, such as Szegedy et al. (2014) and Moosavi-Dezfooli, Fawzi, and Frossard (2016), used the $L_2$ norm to limit the magnitude of the perturbation. In contrast, Su, Vargas, and Sakurai (2017) used the $L_0$ norm to limit the number of modified pixels and showed that modifying even a single pixel could generate adversarial examples.

More recent studies introduced generative adversarial networks (GANs) instead of directly optimizing perturbations (Xiao et al. 2018; Zhao, Dua, and Singh 2018) to ensure the naturalness of adversarial examples. For example, Xiao et al. (2018) trained a discriminator network to distinguish adversarial examples from natural images so that the generator network produced adversarial examples that appeared as natural images. Given the distribution $p_x$ over natural images and the tradeoff parameter $\alpha$, its training process can be represented, similarly to that in Goodfellow et al. (2014), as follows:

$$\min_{G} \max_{D} \; \mathbb{E}_{x \sim p_x}[\log D(x)] + \mathbb{E}_{x \sim p_x}[\log(1 - D(x + G(x)))] + \alpha \, \mathbb{E}_{x \sim p_x}[L_f(x + G(x), t)].$$
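The optimization in Eq. (1) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the paper: the target model is a tiny hand-written linear softmax classifier (`W`, `B`), the loss $L_f$ is taken to be cross-entropy, the norm is the $L_2$ norm with a hypothetical weight `eps`, and the minimization is plain gradient descent with hand-derived gradients.

```python
import math

# Hypothetical 3-class linear classifier on 4-dimensional inputs (illustration only).
W = [
    [1.0, -0.5, 0.2, 0.0],
    [-0.3, 0.8, -0.1, 0.4],
    [0.1, 0.1, 0.9, -0.6],
]
B = [0.0, 0.1, -0.1]


def predict_proba(x):
    """Softmax class probabilities of the toy model f for input x."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W, B)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    return max(range(len(values)), key=lambda k: values[k])


def loss_and_grad(x, v, t, eps):
    """Return L_f(x + v, t) + eps * ||v||_2 and its gradient with respect to v."""
    p = predict_proba([xi + vi for xi, vi in zip(x, v)])
    norm = math.sqrt(sum(vi * vi for vi in v))
    loss = -math.log(p[t]) + eps * norm
    # Gradient of cross-entropy w.r.t. the logits is p - onehot(t).
    dlogits = [pk - (1.0 if k == t else 0.0) for k, pk in enumerate(p)]
    grad = [sum(dlogits[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]
    if norm > 0.0:  # use the zero subgradient of the norm at v = 0
        grad = [g + eps * vi / norm for g, vi in zip(grad, v)]
    return loss, grad


def generate_perturbation(x, t, eps=0.05, lr=0.1, steps=500):
    """Approximate v_hat from Eq. (1) by gradient descent starting at v = 0."""
    v = [0.0] * len(x)
    for _ in range(steps):
        _, grad = loss_and_grad(x, v, t, eps)
        v = [vi - lr * gi for vi, gi in zip(v, grad)]
    return v


x = [1.0, 0.0, 0.0, 0.0]   # toy "image"
t = 2                      # label the attacker wants
v_hat = generate_perturbation(x, t)
x_adv = [xi + vi for xi, vi in zip(x, v_hat)]
print("label before:", argmax(predict_proba(x)))
print("label after: ", argmax(predict_proba(x_adv)))
```

A larger `eps` keeps the perturbation smaller at the cost of a weaker push toward the target label; for a real image classifier the gradient would normally come from automatic differentiation rather than the hand-derived expression used here.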
Nov-19-2019
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Technology: