Maintaining Adversarial Robustness in Continuous Learning

Ru, Xiaolei, Cao, Xiaowei, Liu, Zijia, Moore, Jack Murdoch, Zhang, Xin-Ya, Zhu, Xia, Wei, Wenjia, Yan, Gang

arXiv.org Artificial Intelligence 

Continual learning and adversarial robustness are distinct and important research directions in artificial intelligence, each of which has witnessed significant advances. The former addresses a critical challenge known as catastrophic forgetting, where a neural network trained on a sequence of new tasks typically exhibits a dramatic drop in its performance on previously learned tasks if the model cannot revisit the previous data [1]. The latter focuses on developing defenses against adversarial attacks, which can deceive models into confidently misclassifying objects by adding subtle, targeted perturbations to input images that are often imperceptible to human observers [2]. However, the evolution of a neural network's adversarial robustness in the context of continuous learning remains underexplored. In our experiments, we observe that adversarial robustness enhanced by well-designed defense algorithms on previous data is easily lost when the neural network updates its weights to accommodate new tasks, resulting in a phenomenon similar to catastrophic forgetting. This presents an intriguing challenge: how can we maintain adversarial robustness during continuous learning? In other words, the objective of continuous learning expands to encompass both (classification) performance and adversarial robustness. In this paper, we present a solution by proposing a novel gradient projection technique called Double Gradient Projection (DGP), which inherently enables collaboration with a class of defense algorithms that enhance robustness through sample gradient smoothing. DGP is grounded on a theoretical hypothesis that a neural network's robustness can be maintained if the smoothness of sample gradients from previous data remains unchanged after weight updates.
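The general idea behind gradient projection in continual learning can be illustrated with a minimal sketch. This is not the authors' DGP algorithm, whose details are not given in the abstract; it only shows the generic step of removing from a weight gradient its component along a set of protected directions. The function name `project_out`, the random `basis`, and the dimensions are all hypothetical and chosen for illustration.

```python
import numpy as np


def project_out(grad, basis):
    """Remove from `grad` its component in the column space of `basis`.

    Assumption: `basis` has orthonormal columns spanning the directions
    that should stay untouched by the update (hypothetical here).
    """
    return grad - basis @ (basis.T @ grad)


rng = np.random.default_rng(0)

# Hypothetical protected subspace: 2 orthonormal directions in R^5,
# obtained from the reduced QR factorization of a random matrix.
basis, _ = np.linalg.qr(rng.standard_normal((5, 2)))

# Stand-in for a gradient computed on a new task.
grad = rng.standard_normal(5)

# The projected gradient has (numerically) no component along `basis`.
projected = project_out(grad, basis)
residual = basis.T @ projected
```

An update along `projected` leaves the model's response along the protected directions unchanged to first order, which is the property that projection-based continual learning methods rely on. How the protected subspace is built, and how it relates to the smoothness of sample gradients, is specific to the paper and is not reproduced here.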
