Towards Safe Concept Transfer of Multi-Modal Diffusion via Causal Representation Editing

May-26-2025, 17:16:55 GMT–Neural Information Processing Systems

Recent advancements in vision-language-to-image (VL2I) diffusion generation have made significant progress. While generating images from broad vision-language inputs holds promise, it also raises concerns about potential misuse, such as copying artistic styles without permission, which could have legal and social consequences. To address these issues, researchers have explored various approaches, such as dataset filtering, adversarial perturbations, machine unlearning, and inference-time refusals. However, these methods often lack either scalability or effectiveness. In response, we propose a new framework called causal representation editing (CRE), which extends representation editing from large language models (LLMs) to diffusion-based models. CRE enhances the efficiency and flexibility of safe content generation by intervening at diffusion timesteps causally linked to unsafe concepts.

artificial intelligence, large language model, natural language, (8 more...)

Neural Information Processing Systems

May-26-2025, 17:16:55 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.63)