Reviews: Towards Robust Detection of Adversarial Examples
–Neural Information Processing Systems
This paper proposes a combination of two modifications to make neural networks robust to adversarial examples: (1) reverse cross-entropy training allows the neural network to learn to better estimate its confidence in the output, as opposed to standard cross-entropy training, and (2) a kernel-density based detector detects whether or not the input appears to be adversarial, and rejects the inputs that appear adversarial. The authors appear to perform a proper evaluation of their defense, and argue that it is robust to the attacker who performs a white-box evaluation and optimizes for evading the defense. The defense does not claim to perfectly solve the problem of adversarial examples, but the results appear to be correctly verified. As shown in Figure 3, the adversarial examples on the proposed defense are visually distinguishable from the clean images. It is slightly unclear what is meant by "ratio" in Table 3.
Neural Information Processing Systems
Oct-8-2024, 07:26:23 GMT
- Technology: