On the Adversarial Risk of Test Time Adaptation: An Investigation into Realistic Test-Time Data Poisoning

Yongyi Su, Yushu Li, Nanqing Liu, Kui Jia, Xulei Yang, Chuan-Sheng Foo, Xun Xu

arXiv.org Artificial Intelligence 

Test-time adaptation (TTA) updates the model weights during the inference stage using testing data to enhance generalization. Existing studies have shown that when TTA is updated with crafted adversarial test samples, also known as test-time poisoned data, the performance on benign samples can deteriorate. Nonetheless, the perceived adversarial risk may be overstated if the poisoned data is generated under overly strong assumptions. We therefore propose an effective and realistic attack method that better produces poisoned samples without access to benign samples, and derive an effective in-distribution attack objective. Our benchmarks of existing attack methods reveal that TTA methods are more robust than previously believed. In addition, we analyze effective defense strategies to help develop adversarially robust TTA methods.

Test-time adaptation (TTA) emerges as an effective measure to counter distribution shift at the inference stage (Wang et al., 2020; Liu et al., 2021; Su et al., 2022; Song et al., 2023). Successful TTA methods leverage the testing samples for self-training (Wang et al., 2020; Su et al., 2024b), distribution alignment (Su et al., 2022; Liu et al., 2021), or prompt tuning (Gao et al., 2022). Because the model is updated with unvetted testing data, an adversary can inject crafted poisoned samples into the test stream to degrade performance on benign samples; this task is consequently referred to as Test-Time Data Poisoning (TTDP). The pioneering work DIA (Wu et al., 2023) introduced a poisoning approach that crafts malicious data with access to all benign samples within a minibatch, leveraging real-time model weights for explicit gradient computation, i.e., a white-box attack.
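
To make the self-training flavour of TTA concrete, the following is a minimal sketch of an entropy-minimization update in the style of TENT (Wang et al., 2020), where only the batch-norm affine parameters are adapted on each test batch. The helper names (`collect_bn_params`, `tta_step`) and the hyperparameters are illustrative assumptions, not the paper's implementation.

```python
# Minimal entropy-minimization TTA sketch (TENT-style); illustrative only.
import torch
import torch.nn as nn


def collect_bn_params(model: nn.Module):
    """Gather the affine (scale/shift) parameters of all BatchNorm layers."""
    params = []
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            params += [p for p in (m.weight, m.bias) if p is not None]
    return params


def tta_step(model: nn.Module, optimizer: torch.optim.Optimizer,
             test_batch: torch.Tensor) -> torch.Tensor:
    """One adaptation step: minimize the mean prediction entropy on the batch."""
    logits = model(test_batch)
    probs = logits.softmax(dim=1)
    entropy = -(probs * logits.log_softmax(dim=1)).sum(dim=1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return logits.detach()


# Usage: stream test batches and adapt only the BN affine parameters.
# model = ...  # assumed pretrained source model
# optimizer = torch.optim.SGD(collect_bn_params(model), lr=1e-3, momentum=0.9)
# for test_batch in test_loader:
#     predictions = tta_step(model, optimizer, test_batch).argmax(dim=1)
```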
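
For contrast, below is a hedged sketch of what a white-box poisoning step with minibatch access could look like: the attacker forwards its poisoned samples together with the benign ones through the current model (kept in train mode so batch statistics couple the two groups) and perturbs the poison by projected gradient ascent to increase the error on the benign samples. This only illustrates the threat model described for DIA (Wu et al., 2023); the function name `craft_poison`, the cross-entropy objective, and the PGD hyperparameters are assumptions, not the authors' actual procedure, and `benign_y` stands for whatever labels or pseudo-labels the attacker employs.

```python
# Hedged sketch of a white-box, minibatch-access poisoning step; illustrative
# only, not the DIA algorithm itself.
import torch
import torch.nn.functional as F


def craft_poison(model, benign_x, benign_y, poison_init,
                 steps: int = 20, alpha: float = 2 / 255, eps: float = 8 / 255):
    """PGD-style crafting of poisoned samples under an L_inf budget `eps`.

    The model stays in train mode, so the batch-norm statistics of the shared
    minibatch let the poisoned pixels influence the benign predictions.
    """
    model.train()
    n_benign = benign_x.size(0)
    poison = poison_init.clone().detach()
    for _ in range(steps):
        poison.requires_grad_(True)
        logits = model(torch.cat([benign_x, poison], dim=0))
        # Attack objective: increase the loss (error) on the benign samples.
        attack_loss = F.cross_entropy(logits[:n_benign], benign_y)
        grad, = torch.autograd.grad(attack_loss, poison)
        with torch.no_grad():
            poison = poison + alpha * grad.sign()                  # ascent step
            poison = poison_init + (poison - poison_init).clamp(-eps, eps)
            poison = poison.clamp(0.0, 1.0)
        poison = poison.detach()
    return poison


# Usage: inject craft_poison(model, x_benign, y_benign, x_init) into the
# test stream alongside the benign samples before the victim's TTA update.
```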