Uncovering, Explaining, and Mitigating the Superficial Safety of Backdoor Defense

Neural Information Processing Systems 

However, does achieving a low attack success rate (ASR) through current safety purification methods truly eliminate learned backdoor features from the pretraining phase? In this paper, we provide an affirmative answer to this question by thoroughly investigating the Post-Purification Robustness of current backdoor purification methods.
