anti-dreambooth
Mitigating Semantic Collapse in Generative Personalization with Test-Time Embedding Adjustment
Bui, Anh, Vu, Trang, Le, Trung, Kim, Junae, Abraham, Tamas, Omari, Rollin, Kaur, Amar, Phung, Dinh
In this paper, we investigate the semantic collapsing problem in generative personalization, an under-explored topic where the learned visual concept ($V$) gradually shifts from its original textual meaning and comes to dominate other concepts in multi-concept input prompts. This issue not only reduces the semantic richness of complex input prompts like "a photo of $V$ wearing glasses and playing guitar" into simpler, less contextually rich forms such as "a photo of $V$" but also leads to simplified output images that fail to capture the intended concept. We identify the root cause as unconstrained optimisation, which allows the learned embedding $V$ to drift arbitrarily in the embedding space, both in direction and magnitude. To address this, we propose a simple yet effective training-free method that adjusts the magnitude and direction of pre-trained embedding at inference time, effectively mitigating the semantic collapsing problem. Our method is broadly applicable across different personalization methods and demonstrates significant improvements in text-image alignment in diverse use cases. Our code is anonymously published at https://github.com/tuananhbui89/Embedding-Adjustment
- Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
- Oceania > Australia (0.04)
- Europe > United Kingdom > North Sea > Southern North Sea (0.04)
- Asia > China > Hong Kong (0.04)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.46)
Four ways to protect your art from AI
Artists and writers have launched several lawsuits against AI companies, arguing that their work has been scraped into databases for training AI models without consent or compensation. Tech companies have responded that anything on the public internet falls under fair use. But it will be years until we have a legal resolution to the problem. Unfortunately, there is little you can do if your work has been scraped into a data set and used in a model that is already out there. You can, however, take steps to prevent your work from being used in the future.
- Information Technology (0.76)
- Law > Litigation (0.59)
Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models
Xu, Jingyao, Lu, Yuetong, Li, Yandong, Lu, Siyang, Wang, Dongdong, Wei, Xiang
Diffusion models (DMs) embark a new era of generative modeling and offer more opportunities for efficient generating high-quality and realistic data samples. However, their widespread use has also brought forth new challenges in model security, which motivates the creation of more effective adversarial attackers on DMs to understand its vulnerability. We propose CAAT, a simple but generic and efficient approach that does not require costly training to effectively fool latent diffusion models (LDMs). The approach is based on the observation that cross-attention layers exhibits higher sensitivity to gradient change, allowing for leveraging subtle perturbations on published images to significantly corrupt the generated images. We show that a subtle perturbation on an image can significantly impact the cross-attention layers, thus changing the mapping between text and image during the fine-tuning of customized diffusion models. Extensive experiments demonstrate that CAAT is compatible with diverse diffusion models and outperforms baseline attack methods in a more effective (more noise) and efficient (twice as fast as Anti-DreamBooth and Mist) manner.
Anti-DreamBooth: Protecting users from personalized text-to-image synthesis
Van Le, Thanh, Phung, Hao, Nguyen, Thuan Hoang, Dao, Quan, Tran, Ngoc, Tran, Anh
Text-to-image diffusion models are nothing but a revolution, allowing anyone, even without design skills, to create realistic images from simple text inputs. With powerful personalization tools like DreamBooth, they can generate images of a specific person just by learning from his/her few reference images. However, when misused, such a powerful and convenient tool can produce fake news or disturbing content targeting any individual victim, posing a severe negative social impact. In this paper, we explore a defense system called Anti-DreamBooth against such malicious use of DreamBooth. The system aims to add subtle noise perturbation to each user's image before publishing in order to disrupt the generation quality of any DreamBooth model trained on these perturbed images. We investigate a wide range of algorithms for perturbation optimization and extensively evaluate them on two facial datasets over various text-to-image model versions. Despite the complicated formulation of DreamBooth and Diffusion-based text-to-image models, our methods effectively defend users from the malicious use of those models. Their effectiveness withstands even adverse conditions, such as model or prompt/term mismatching between training and testing. Our code will be available at https://github.com/VinAIResearch/Anti-DreamBooth.git.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Security & Privacy (0.69)
- Media (0.66)