MemLoss: Enhancing Adversarial Training with Recycling Adversarial Examples

Soroush Mahdi, Maryam Amirmazlaghani, Saeed Saravani, Zahra Dehghanian

arXiv.org Artificial Intelligence 

Szegedy et al. [1] were the first to demonstrate that small, imperceptible perturbations to input data can cause neural networks to make incorrect predictions with high confidence. This discovery exposed a significant vulnerability in machine learning models, introduced the concept of adversarial attacks, and has driven extensive research into improving model robustness in recent years [1, 2]. Adversarial training, widely regarded as the most prominent defense against adversarial machine learning (AML) attacks, enhances robustness by incorporating both benign and adversarial examples into the training process [3]. However, it often comes at the cost of reduced accuracy on clean data [4].
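To make the idea of training on both benign and adversarial examples concrete, the sketch below runs adversarial training on a toy logistic-regression model in NumPy. This is an illustration only, not the paper's method: the inner attack here is the Fast Gradient Sign Method (FGSM), and the toy dataset, epsilon, and learning rate are all assumed values chosen for clarity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """FGSM attack: perturb each input along the sign of the loss gradient w.r.t. x.

    For logistic regression with binary cross-entropy, d(loss)/dx = (p - y) * w.
    """
    p = sigmoid(x @ w + b)
    grad_x = np.outer(p - y, w)
    return x + eps * np.sign(grad_x)

def adversarial_train(x, y, epochs=200, lr=0.5, eps=0.1, seed=0):
    """Gradient descent on a mixed batch of clean and FGSM-perturbed examples."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=x.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Regenerate adversarial examples against the current model each epoch.
        x_adv = fgsm(x, y, w, b, eps)
        x_mix = np.vstack([x, x_adv])
        y_mix = np.concatenate([y, y])
        p = sigmoid(x_mix @ w + b)
        w -= lr * (x_mix.T @ (p - y_mix)) / len(y_mix)
        b -= lr * np.mean(p - y_mix)
    return w, b

# Toy linearly separable data: the label depends only on the first feature.
x = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = adversarial_train(x, y)

clean_acc = np.mean((sigmoid(x @ w + b) > 0.5) == y)
adv_acc = np.mean((sigmoid(fgsm(x, y, w, b, 0.1) @ w + b) > 0.5) == y)
```

Regenerating the adversarial examples from the current parameters at every epoch is what distinguishes adversarial training from simply augmenting the dataset once up front; the accuracy gap between `clean_acc` and `adv_acc` on a real model is the clean-accuracy cost the paragraph above refers to.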