EP-CFG: Energy-Preserving Classifier-Free Guidance
Zhang, Kai, Luan, Fujun, Bi, Sai, Zhang, Jianming
arXiv.org Artificial Intelligence
Classifier-free guidance (CFG) (Ho & Salimans, 2022) is widely used in diffusion models (Ho et al., 2020; Song et al., 2020) for text-guided generation, but often leads to over-contrast and oversaturation artifacts. We propose EP-CFG, a simple yet effective CFG solution that preserves the energy distribution of the conditional prediction while maintaining strong semantic alignment. Modern text-to-image models typically use a CFG strength of around 7-10 (Rombach et al., 2022) to sample high-quality visuals. However, such high CFG strength is well known to cause over-contrast and oversaturation artifacts (Ho & Salimans, 2022). Concurrent work (Sadat et al., 2024) proposed APG to address oversaturation through update term decomposition.
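The idea of preserving the energy of the conditional prediction can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the code below makes two assumptions: "energy" is taken as the sum of squared entries of a prediction, and preservation is done by rescaling the guided prediction so its energy matches that of the conditional one. The function names (cfg_combine, energy, energy_preserving_cfg) and the flat-list representation of a prediction are hypothetical and chosen only for illustration; the first function is the standard CFG combination.

```python
import math


def cfg_combine(cond, uncond, scale):
    """Standard CFG: uncond + scale * (cond - uncond), applied elementwise."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def energy(x):
    """Energy of a prediction, assumed here to be the sum of squared entries."""
    return sum(v * v for v in x)


def energy_preserving_cfg(cond, uncond, scale, eps=1e-12):
    """Apply CFG, then rescale so the result has the conditional energy.

    Assumption: a single global rescaling factor; eps guards against
    division by zero when the guided prediction is all zeros.
    """
    guided = cfg_combine(cond, uncond, scale)
    factor = math.sqrt(energy(cond) / (energy(guided) + eps))
    return [factor * g for g in guided]


if __name__ == "__main__":
    cond = [1.0, 2.0, -1.0]    # toy conditional prediction
    uncond = [0.5, 1.0, 0.0]   # toy unconditional prediction
    guided = cfg_combine(cond, uncond, 7.5)
    rescaled = energy_preserving_cfg(cond, uncond, 7.5)
    print("conditional energy:", energy(cond))
    print("plain CFG energy:  ", energy(guided))
    print("rescaled energy:   ", energy(rescaled))
```

In this toy case, plain CFG at strength 7.5 inflates the energy far above that of the conditional prediction, which is one plausible reading of where over-contrast comes from; the rescaled output keeps the guided direction while bringing the energy back to the conditional level.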
Dec-13-2024
- Genre:
- Research Report (0.40)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning (0.90)
- Vision (0.67)