MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models

Donghao Zhou, Jiancheng Huang, Jinbin Bai, Jiaze Wang, Hao Chen, Guangyong Chen, Xiaowei Hu, Pheng-Ann Heng

arXiv.org Artificial Intelligence 

Recent text-to-image diffusion models can generate high-quality images from text prompts, but they lack precise control over specific components within visual concepts. To address this, we introduce component-controllable personalization, a new task that lets users customize and reconfigure individual components of a personalized concept. The task poses two challenges: semantic pollution, where unwanted visual elements distort the learned concept, and semantic imbalance, where the target concept and the target component are learned disproportionately. We therefore design MagicTailor, a framework that uses Dynamic Masked Degradation to adaptively perturb unwanted visual semantics and Dual-Stream Balancing to balance the learning of the desired visual semantics. Experimental results show that MagicTailor outperforms existing methods on this task, enabling more personalized, nuanced, and creative image generation.
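To make the idea of masked degradation concrete, here is a minimal sketch in NumPy. It assumes (this is an illustration, not the paper's exact algorithm) that pixels outside the target-component mask are perturbed with Gaussian noise whose strength decays over training steps, so that unwanted semantics are suppressed early while the masked region is left intact; the linear schedule and noise form are assumptions.

```python
import numpy as np

def dynamic_masked_degradation(image, target_mask, step, total_steps, rng=None):
    """Illustrative sketch of mask-based degradation (assumptions, not
    MagicTailor's exact method): pixels OUTSIDE the target-component
    mask are perturbed with Gaussian noise whose strength decays
    linearly over training, while masked pixels stay untouched."""
    rng = rng or np.random.default_rng(0)
    strength = 1.0 - step / total_steps            # assumed linear decay schedule
    noise = rng.normal(0.0, strength, image.shape)
    keep = target_mask[..., None].astype(bool)     # broadcast mask over channels
    degraded = np.where(keep, image, image + noise)
    return np.clip(degraded, 0.0, 1.0)

# Toy usage: 8x8 RGB image with a 4x4 target region that must survive intact.
img = np.full((8, 8, 3), 0.5)
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1
out = dynamic_masked_degradation(img, mask, step=0, total_steps=100)
```

In a real pipeline, the noise strength could also be modulated per image rather than per step; the point of the sketch is only that degradation is confined to the unmasked region.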