BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing
Neural Information Processing Systems
Our model is built on the pre-trained Stable Diffusion model, which was trained on web-scraped datasets. Proper content moderation and regulation are highly advised to prevent undesirable consequences. In Figure 1, we outline common failure cases of the model. Subject images used for finetuning are shown on the left. We briefly introduce these methods below.
Feb-12-2026, 13:48:05 GMT