Robust Deep Learning Models Against Semantic-Preserving Adversarial Attack
Gao, Dashan; Zhao, Yunce; Yao, Yinghua; Zhang, Zeqi; Mao, Bifei; Yao, Xin
arXiv.org Artificial Intelligence
Deep learning models can be fooled both by small $l_p$-norm adversarial perturbations and by natural perturbations of semantic attributes. Although robustness against each type of perturbation has been studied separately, achieving robustness against the two applied jointly remains a challenge. In this paper, we study the robustness of deep learning models against joint perturbations by proposing a novel attack mechanism, the Semantic-Preserving Adversarial (SPA) attack, which can in turn be used to enhance adversarial training. Specifically, we introduce an attribute manipulator that generates natural, human-comprehensible perturbations and a noise generator that generates diverse adversarial noises. Combining the two, we optimize both the attribute value and the diversity variable to generate jointly perturbed samples. For robust training, we adversarially train the deep learning model against the generated joint perturbations. Empirical results on four benchmarks show that, under small $l_{\infty}$ norm-ball constraints, the SPA attack causes a larger performance decline than existing approaches. Furthermore, our SPA-enhanced training outperforms existing defense methods against such joint perturbations.
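The joint perturbation described in the abstract can be illustrated with a toy sketch. This is not the authors' SPA implementation. It assumes a logistic-regression classifier, models the attribute manipulator as a shift `a * attr_dir` along a fixed, hypothetical attribute direction, and replaces the learned noise generator and its diversity variable with plain signed-gradient ascent on an $l_{\infty}$-bounded noise vector `delta`. All names (`joint_attack`, `attr_dir`, `attr_max`, `eps`) are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    return (v > 0) - (v < 0)


def clamp(v, lo, hi):
    return max(lo, min(hi, v))


def bce_loss(x, y, w, b):
    """Binary cross-entropy of a logistic-regression model on one sample."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def perturb(x, a, attr_dir, delta):
    # Joint perturbation: attribute shift plus additive noise.
    return [xi + a * di + ni for xi, di, ni in zip(x, attr_dir, delta)]


def joint_attack(x, y, w, b, attr_dir, eps=0.1, attr_max=1.0, steps=20, lr=0.05):
    """Toy joint attack (assumption, not the paper's method).

    Maximizes the loss over an attribute value `a` with |a| <= attr_max
    and a noise vector `delta` with ||delta||_inf <= eps.
    """
    a = 0.0
    delta = [0.0] * len(x)
    for _ in range(steps):
        x_adv = perturb(x, a, attr_dir, delta)
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)) + b)
        # Gradient of the cross-entropy loss w.r.t. the input is (p - y) * w.
        grad_x = [(p - y) * wi for wi in w]
        # Chain rule: the attribute value enters the input through attr_dir.
        grad_a = sum(gi * di for gi, di in zip(grad_x, attr_dir))
        # Signed-gradient ascent steps, each projected back onto its constraint set.
        a = clamp(a + lr * sign(grad_a), -attr_max, attr_max)
        delta = [clamp(ni + lr * sign(gi), -eps, eps) for ni, gi in zip(delta, grad_x)]
    return a, delta, perturb(x, a, attr_dir, delta)


# Example call on made-up numbers.
x, y, w, b = [1.0, 2.0], 1.0, [0.5, -0.3], 0.1
a, delta, x_adv = joint_attack(x, y, w, b, attr_dir=[1.0, 0.0])
print("clean loss:", bce_loss(x, y, w, b))
print("attacked loss:", bce_loss(x_adv, y, w, b))
```

In this sketch the attribute value and the noise are updated in the same loop, which mirrors the idea of optimizing both components jointly. Adversarial training in this spirit would then fit the model on the returned `x_adv` samples instead of, or in addition to, the clean inputs.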
Apr-8-2023