Exposing Backdoors in Robust Machine Learning Models
Soremekun, Ezekiel, Udeshi, Sakshi, Chattopadhyay, Sudipta, Zeller, Andreas
The introduction of robust optimisation has pushed the state-of-the-art in defending against adversarial attacks. However, the behaviour of such optimisation has not been studied in the light of a fundamentally different class of attacks called backdoors. In this paper, we demonstrate that adversarially robust models are susceptible to backdoor attacks. Subsequently, we observe that backdoors are reflected in the feature representation of such models. Then, this is leveraged to detect backdoor-infected models. Specifically, we use feature clustering to effectively detect backdoor-infected robust Deep Neural Networks (DNNs). In our evaluation of major classification tasks, our approach effectively detects robust DNNs infected with backdoors. Our investigation reveals that salient features of adversarially robust DNNs break the stealthy nature of backdoor attacks.
Feb-24-2020
- Country:
- North America
- United States
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- California
- San Francisco County > San Francisco (0.14)
- Santa Clara County > San Jose (0.04)
- San Diego County > San Diego (0.04)
- Louisiana > Orleans Parish
- Canada
- Quebec > Montreal (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.05)
- Alberta > Census Division No. 15
- Improvement District No. 9 > Banff (0.04)
- United States
- Europe
- Asia
- Singapore (0.04)
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.04)
- North America
- Genre:
- Research Report (0.82)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Technology: