Towards Safe Concept Transfer of Multi-Modal Diffusion via Causal Representation Editing

Neural Information Processing Systems 

Vision-language-to-image (VL2I) diffusion generation has made significant progress recently. While generating images from broad vision-language inputs holds promise, it also raises concerns about potential misuse, such as copying artistic styles without permission, which could have legal and social consequences.