Understanding and Improving Adversarial Attacks on Latent Diffusion Model

Boyang Zheng, Chumeng Liang, Xiaoyu Wu, Yan Liu

arXiv.org Artificial Intelligence 

Adversarial attacks on LDM have thus emerged to protect unauthorized images from being used in LDM-driven few-shot generation. However, these attacks suffer from moderate performance and excessive computational cost, especially in GPU memory. In this paper, we propose an effective adversarial attack on LDM that shows superior performance against state-of-the-art few-shot generation pipelines of LDM, for example, LoRA. We implement the attack with memory efficiency by introducing several mechanisms and reduce the memory cost of the attack to less than 6 GB, which allows individual users to run the attack on a majority of consumer GPUs.

Figure 2: Few-shot generation based on adversarial examples outputs low-quality images. The adversarial budget is 4/255.

Diffusion models (Sohl-Dickstein et al., 2015; Song & Ermon, 2019; Ho et al., 2020; Song et al., 2020) have long held the promise of producing fine-grained content that resembles real data. The application of the Latent Diffusion Model (LDM) in few-shot generation, that is, generating data from only a few reference images, has pushed state-of-the-art performance forward by a significant margin and sparked a craze for AI-generated art (Meng et al., 2021; Gal et al., 2022; Ruiz et al., 2023; Roich et al., 2022; Zhang & Agrawala, 2023).

While the opportunities presented by LDM are immense, the implications of its power are a double-edged sword. Malicious individuals leverage LDM-driven few-shot generation to copy artworks without authorization (Fan et al., 2023) and to create fake not-suitable-for-work photos of personal figures (Wang et al., 2023b). Such malevolent applications of LDM threaten the sanctity of personal data and intellectual property. Recognizing this need, adversarial attacks on LDM have been proposed as countermeasures (Salman et al., 2023; Liang et al., 2023; Shan et al., 2023; Van Le et al., 2023). These attacks add human-invisible perturbations to a real image, turning it into an adversarial example that is unusable in LDM-driven few-shot generation. Applications built on these adversarial attacks (Liang & Wu, 2023; Shan et al., 2023) serve as tools to protect personal images from being used as reference data for LDM-driven few-shot generation. However, existing adversarial attacks on LDM suffer from only moderate effectiveness.
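To make the perturbation mechanism concrete, the sketch below illustrates the generic PGD-style procedure that this family of attacks builds on: gradient ascent on the LDM denoising loss with respect to the input image, projected onto an L-infinity ball of radius 4/255. This is a minimal illustration assuming PyTorch and the diffusers StableDiffusionPipeline; the function name pgd_attack, the empty-prompt conditioning, and the step sizes are illustrative choices, not the authors' implementation.

```python
# Minimal sketch (not the paper's method) of a PGD-style adversarial attack
# on an LDM: maximize the denoising loss w.r.t. the input image under an
# L-infinity budget of 4/255. Assumes PyTorch and the diffusers library.
import torch
import torch.nn.functional as F
from diffusers import StableDiffusionPipeline

def pgd_attack(pipe, image, steps=40, eps=4 / 255, alpha=1 / 255):
    """image: [1, 3, H, W] tensor in [0, 1]; returns a perturbed copy."""
    device = pipe.device
    # Freeze model weights; gradients are only needed w.r.t. the image.
    pipe.vae.requires_grad_(False)
    pipe.unet.requires_grad_(False)
    pipe.text_encoder.requires_grad_(False)

    x0 = image.to(device)
    delta = torch.zeros_like(x0, requires_grad=True)

    # Empty-prompt conditioning as a simple stand-in for captions.
    with torch.no_grad():
        tokens = pipe.tokenizer([""], padding="max_length",
                                max_length=pipe.tokenizer.model_max_length,
                                return_tensors="pt").input_ids.to(device)
        cond = pipe.text_encoder(tokens)[0]

    for _ in range(steps):
        x_adv = (x0 + delta).clamp(0, 1)
        # Encode to latent space (the VAE expects inputs in [-1, 1]).
        latents = pipe.vae.encode(x_adv * 2 - 1).latent_dist.sample()
        latents = latents * pipe.vae.config.scaling_factor
        # Add noise at a random timestep, mirroring LDM training.
        t = torch.randint(0, pipe.scheduler.config.num_train_timesteps,
                          (latents.shape[0],), device=device)
        noise = torch.randn_like(latents)
        noisy_latents = pipe.scheduler.add_noise(latents, noise, t)
        # Denoising loss; ascending it degrades few-shot fine-tuning on x_adv.
        pred = pipe.unet(noisy_latents, t, encoder_hidden_states=cond).sample
        loss = F.mse_loss(pred, noise)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)  # project back into the 4/255 budget
        delta.grad.zero_()
    return (x0 + delta).clamp(0, 1).detach()
```

In practice one would load a pipeline such as StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to("cuda") and pass a normalized image tensor; the memory-saving mechanisms that bring the footprint below 6 GB are the paper's contribution and are not reflected in this sketch.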