The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
Omri Avrahami, Amir Hertz, Yael Vinker, Moab Arar, Shlomi Fruchter, Ohad Fried, Daniel Cohen-Or, Dani Lischinski
arXiv.org Artificial Intelligence
Recent advances in text-to-image generation models have unlocked vast potential for visual creativity. However, these models struggle with generation of consistent characters, a crucial aspect for numerous real-world applications such as story visualization, game development asset design, advertising, and more. Current methods typically rely on multiple pre-existing images of the target character or involve labor-intensive manual processes. In this work, we propose a fully automated solution for consistent character generation, with the sole input being a text prompt. We introduce an iterative procedure that, at each stage, identifies a coherent set of images sharing a similar identity and extracts a more consistent identity from this set. Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency compared to the baseline methods, and these findings are reinforced by a user study. To conclude, we showcase several practical applications of our approach. Project page is available at https://omriavrahami.com/the-chosen-one
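The iterative procedure described in the abstract (find a coherent set of images, extract a more consistent identity from it, repeat) can be illustrated with a toy sketch. This is not the authors' implementation: random vectors stand in for image embeddings, plain k-means stands in for whatever grouping step the method uses, and the "identity" is reduced to a mean vector and a spread. All function names and parameters (`kmeans`, `most_cohesive_cluster`, `refine_identity`, `min_size`, `n_images`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def kmeans(x, k, rng, iters=20):
    """Plain k-means on the rows of x; returns (labels, centroids)."""
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # Distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(axis=0)
    return labels, centroids


def most_cohesive_cluster(x, k, rng, min_size=3):
    """Return the cluster whose members lie closest to their centroid.

    Cohesion is measured as mean distance to the centroid (an assumption).
    Clusters smaller than min_size are ignored; if none qualifies, all
    points are returned.
    """
    labels, centroids = kmeans(x, k, rng)
    best, best_score = x, np.inf
    for j in range(k):
        members = x[labels == j]
        if len(members) < min_size:
            continue
        score = np.linalg.norm(members - centroids[j], axis=1).mean()
        if score < best_score:
            best, best_score = members, score
    return best


def refine_identity(seed=0, n_images=32, k=4, rounds=5, dim=16):
    """Toy refinement loop; returns the spread after each round."""
    rng = np.random.default_rng(seed)
    center = np.zeros(dim)  # stand-in for the current identity
    spread = 1.0            # stand-in for how much generations vary
    history = []
    for _ in range(rounds):
        # Stand-in for "generate images from the prompt and embed them".
        emb = center + spread * rng.normal(size=(n_images, dim))
        cluster = most_cohesive_cluster(emb, k, rng)
        # Stand-in for "extract a more consistent identity from this set".
        center = cluster.mean(axis=0)
        spread = float(cluster.std(axis=0).mean())
        history.append(spread)
    return history


if __name__ == "__main__":
    print(refine_identity())
```

In this toy setting the spread recorded per round tends to shrink, which mirrors the intuition that each round narrows the set of plausible identities. In a real pipeline, the sampling line would be replaced by image generation plus a feature extractor, and the mean/spread update by a model personalization step.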
Nov-27-2023
- Country:
- Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.04)
- North America > United States > New York (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Genre:
- Research Report (1.00)
- Questionnaire & Opinion Survey (0.70)
- Industry:
- Media (0.68)
- Information Technology > Software (0.34)
- Technology: