Moderate-fitting as a Natural Backdoor Defender for Pre-trained Language Models

Neural Information Processing Systems 

Therefore, if we could properly restrict the PLM's adaptation to the moderate-fitting stage, the model would neglect the backdoor triggers while still achieving satisfactory performance on the original task.