Constrained Optimization with Dynamic Bound-scaling for Effective NLP Backdoor Defense

Guangyu Shen, Yingqi Liu, Guanhong Tao, Qiuling Xu, Zhuo Zhang, Shengwei An, Shiqing Ma, Xiangyu Zhang

arXiv.org Artificial Intelligence 

We develop a novel optimization method for NLP backdoor inversion. We leverage a dynamically reducing temperature coefficient in the softmax function to provide changing loss landscapes to the optimizer, such that the process gradually focuses on the ground truth trigger, which is denoted as a one-hot value in a convex hull. Our method also features a temperature rollback mechanism to step away from local optima, exploiting the observation that local optima can be easily identified in NLP trigger inversion (though not in general optimization). We evaluate the technique on over 1600 models (with roughly half of them having injected backdoors) on 3 prevailing NLP tasks, with 4 different backdoor attacks and 7 architectures. Our results show that the technique is able to effectively and efficiently detect these backdoored models and remove the injected backdoors.

[Figure 1: Difficult loss landscape with a low temperature]
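The mechanics described above — relaxing discrete trigger tokens into a probability vector over the vocabulary (a point in a convex hull), sharpening it toward a one-hot vertex by shrinking the softmax temperature, and rolling the temperature back up when optimization gets stuck — can be sketched as a toy. Everything here is illustrative: the surrogate loss, the schedule constants, and the function names are hypothetical, not the paper's actual implementation or hyper-parameters.

```python
import numpy as np

def softmax(logits, temp):
    # Temperature-scaled softmax: a low temperature sharpens the
    # distribution toward a one-hot vertex of the convex hull.
    a = logits / temp
    a = a - a.max()
    e = np.exp(a)
    return e / e.sum()

def invert_trigger(loss_fn, vocab_size, steps=300, lr=2.0,
                   temp=2.0, decay=0.98, temp_floor=0.05, rollback=1.5):
    """Toy trigger inversion with a shrinking softmax temperature.

    loss_fn maps a probability vector over the vocabulary to a scalar.
    The decay/rollback rule is a simplified stand-in for the paper's
    dynamic bound-scaling schedule.
    """
    rng = np.random.default_rng(0)
    z = rng.normal(scale=0.1, size=vocab_size)  # relaxed trigger logits
    prev_loss = np.inf
    for _ in range(steps):
        p = softmax(z, temp)
        loss = loss_fn(p)
        # Numerical gradient of the loss w.r.t. logits (fine for a toy).
        grad = np.zeros_like(z)
        eps = 1e-5
        for i in range(vocab_size):
            z[i] += eps
            grad[i] = (loss_fn(softmax(z, temp)) - loss) / eps
            z[i] -= eps
        z -= lr * grad
        if loss > prev_loss + 1e-6:
            # Loss went up: treat it as a suspected local optimum and
            # roll the temperature back up to smooth the landscape.
            temp = min(temp * rollback, 2.0)
        else:
            temp = max(temp * decay, temp_floor)
        prev_loss = loss
    return softmax(z, temp)

# Hypothetical surrogate loss: minimized when all probability mass
# sits on the hidden ground-truth trigger token (index 7 here).
TRIGGER = 7
p = invert_trigger(lambda q: 1.0 - q[TRIGGER], vocab_size=20)
```

After optimization, `p` should concentrate on the hidden trigger index, i.e. `p.argmax() == TRIGGER`; in a real defense the loss would instead measure how strongly the candidate trigger flips the model's predictions to the target label.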