Adversarial attacks hidden in plain sight
Göpfert, Jan Philip, Wersing, Heiko, Hammer, Barbara
The use of convolutional neural networks has led to tremendous achievements since Krizhevsky et al. [1] presented AlexNet in 2012. Despite efforts to understand the inner workings of such neural networks, they mostly remain black boxes that are hard to interpret or explain. The issue was exacerbated in 2013 when Szegedy et al. [2] showed that "adversarial examples" - images perturbed in such a way that they fool a neural network - prove that neural networks do not simply work correctly the way one might naïvely expect. Typically, such adversarial attacks change an input only slightly, but in an adversarial manner, so that humans would not regard the difference between the inputs as relevant, but machines do. There are various types of attacks, such as one-pixel attacks, attacks that work in the physical world, and attacks that produce inputs that fool several different neural networks without explicit knowledge of those networks [3, 4, 5].
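To make the idea of a small, targeted perturbation concrete, the sketch below shows one common gradient-based attack, the fast gradient sign method (FGSM), assuming a pretrained PyTorch classifier. It is an illustrative example of the general attack family described above, not the specific method studied in this paper; the model, input tensor, and epsilon value are placeholders.

```python
# Minimal FGSM sketch: perturb each pixel by at most epsilon in the direction
# that increases the classification loss. Assumes `model` is a pretrained
# PyTorch classifier and `image` is a normalized tensor of shape [1, C, H, W].
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.01):
    """Return an adversarially perturbed copy of `image`."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # A single signed gradient step, bounded by epsilon per pixel, is often
    # enough to change the predicted class while remaining imperceptible.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```

With a small epsilon (e.g. 0.01 for inputs scaled to [0, 1]), the perturbed image is typically indistinguishable from the original to a human observer, yet the model's prediction can flip.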
Feb-25-2019