EP-CFG: Energy-Preserving Classifier-Free Guidance

Zhang, Kai, Luan, Fujun, Bi, Sai, Zhang, Jianming

arXiv.org Artificial Intelligence 

Classifier-free guidance (CFG) (Ho & Salimans, 2022) is widely used in diffusion models (Ho et al., 2020; Song et al., 2020) for text-guided generation, but often leads to over-contrast and over-saturation artifacts. We propose EP-CFG, a simple yet effective CFG solution that preserves the energy distribution of the conditional prediction while maintaining strong semantic alignment. Modern text-to-image models typically use a CFG strength of around 7-10 (Rombach et al., 2022) to sample high-quality visuals. However, it is well known that such high CFG strength can cause over-contrast and over-saturation artifacts (Ho & Salimans, 2022). Concurrent work (Sadat et al., 2024) proposed APG to address over-saturation through update term decomposition.
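To make the idea of energy preservation concrete, the following is a minimal sketch rather than the reference implementation. It assumes that "energy" means the sum of squared entries of a prediction, and that the guided prediction is rescaled by a single global factor so its energy matches that of the conditional prediction; the text above does not spell out either choice. The function names `cfg`, `energy`, and `ep_cfg` and the `eps` stabilizer are hypothetical, and plain Python lists stand in for model output tensors.

```python
import math


def cfg(cond, uncond, scale):
    """Standard CFG: extrapolate from the unconditional toward the conditional prediction."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def energy(x):
    """Assumed energy measure: sum of squared entries."""
    return sum(v * v for v in x)


def ep_cfg(cond, uncond, scale, eps=1e-12):
    """Rescale the CFG output so its energy matches the conditional prediction's.

    Sketch under the assumptions stated above; eps only guards against
    division by zero when the guided prediction is all zeros.
    """
    guided = cfg(cond, uncond, scale)
    factor = math.sqrt(energy(cond) / (energy(guided) + eps))
    return [factor * v for v in guided]


# Toy predictions standing in for the outputs of a diffusion model.
cond = [1.0, 2.0, -1.0]
uncond = [0.5, 1.0, 0.0]

guided = cfg(cond, uncond, scale=7.5)
rescaled = ep_cfg(cond, uncond, scale=7.5)

print(energy(cond), energy(guided), energy(rescaled))
```

In this toy case, plain CFG at strength 7.5 inflates the energy well beyond that of the conditional prediction, while the rescaled output keeps the guided direction but returns to the conditional energy level. This is the intuition behind attributing over-contrast to excess energy.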