How Robust Are Energy-Based Models Trained With Equilibrium Propagation?

Siddharth Mansingh, Michal Kucer, Garrett Kenyon, Juston Moore, Michael Teti

arXiv.org Artificial Intelligence 

Deep neural networks (DNNs) are easily fooled by adversarial perturbations that are imperceptible to humans. Adversarial training, a process in which adversarial examples are added to the training set, is the current state-of-the-art defense against adversarial attacks, but it lowers the model's accuracy on clean inputs, is computationally expensive, and offers less robustness to natural noise. In contrast, energy-based models (EBMs), which were designed for efficient implementation in neuromorphic hardware and physical systems, incorporate feedback connections from each layer to the previous layer, yielding a recurrent, deep-attractor architecture that we hypothesize should make them naturally robust. Our work is the first to explore the robustness of EBMs to both natural corruptions and adversarial attacks, which we do using the CIFAR-10 and CIFAR-100 datasets. We demonstrate that EBMs are more robust than transformers and display comparable robustness to adversarially-trained DNNs on gradient-based (white-box) attacks, query-based (black-box) attacks, and natural perturbations, without sacrificing clean accuracy and without the need for adversarial training or additional training techniques.

Deep neural networks (DNNs) are easily fooled by carefully crafted perturbations (i.e., adversarial attacks) that are imperceptible to humans Szegedy et al. (2014); Carlini & Wagner (2017); Madry et al. (2017), as well as by natural noise Hendrycks & Dietterich (2019). Adversarial training, a process which involves training on adversarial examples, is the current state-of-the-art defense against adversarial attacks Madry et al. (2017). However, adversarial training is computationally expensive and also leads to a drop in accuracy on clean/unperturbed test data Tsipras et al. (2018), a well-established tradeoff that has been described theoretically Schmidt et al. (2018); Zhang et al. (2019) and observed experimentally Stutz et al. (2019); Raghunathan et al. (2019). Moreover, adversarially-trained models overfit to the attack they are trained with and perform poorly under different attacks Wang et al. (2020), as well as under natural noise and corruptions.
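For concreteness, the following is a minimal PyTorch-style sketch of the adversarial-training baseline discussed above: adversarial examples are crafted with projected gradient descent (PGD), the gradient-based white-box attack of Madry et al. (2017), and substituted for the clean batch during training. The model, optimizer, and the hyperparameters eps, alpha, and steps are illustrative placeholders, not the settings used in this paper.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Craft L-infinity-bounded adversarial examples with projected gradient descent.

    Generic PGD sketch (Madry et al., 2017); eps/alpha/steps are illustrative values,
    not the attack configuration evaluated in this work.
    """
    x_adv = x.clone().detach()
    # Random start inside the epsilon ball around the clean input.
    x_adv = x_adv + torch.empty_like(x_adv).uniform_(-eps, eps)
    x_adv = torch.clamp(x_adv, 0.0, 1.0)

    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back onto the epsilon ball and valid pixel range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
        x_adv = torch.clamp(x_adv, 0.0, 1.0)
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    """One adversarial-training step: the perturbed batch replaces the clean one."""
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The extra inner PGD loop per batch is what makes adversarial training computationally expensive, and training only on perturbations of a fixed type and budget is what drives the clean-accuracy tradeoff and the overfitting to the training-time attack noted above.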