Alignment of Diffusion Model and Flow Matching for Text-to-Image Generation
Yidong Ouyang, Liyan Xie, Hongyuan Zha, Guang Cheng
Diffusion models and flow matching have demonstrated remarkable success in text-to-image generation. Many existing alignment methods fine-tune a pre-trained generative model to maximize a given reward function, but these approaches require extensive computational resources and may not generalize well across different objectives. In this work, we propose a novel alignment framework that exploits the underlying nature of the alignment problem -- sampling from a reward-weighted distribution -- and show that it applies to both diffusion models (via score guidance) and flow matching models (via velocity guidance). The score function (velocity field) of the reward-weighted distribution decomposes into the pre-trained score (velocity field) plus a term involving a conditional expectation of the reward. For diffusion-model alignment, we identify a fundamental challenge: the adversarial nature of the guidance term can introduce undesirable artifacts into the generated images. We therefore propose a finetuning-free framework that trains a guidance network to estimate the conditional expectation of the reward, achieving performance comparable to finetuning-based models with one-step generation and at least a 60% reduction in computational cost. For flow-matching alignment, we propose a training-free framework that improves generation quality without additional computational cost.
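The decomposition described in the abstract -- the guided score equals the pre-trained score plus a gradient term derived from a conditional reward expectation -- can be sketched as below. This is a minimal illustration, not the paper's implementation: the network interfaces, the log-reward parameterization, and the guidance scale are all illustrative assumptions.

```python
import torch

def guided_score(x_t, t, base_score_net, reward_guidance_net, scale=1.0):
    """Reward-guided score at noise level t (illustrative sketch).

    base_score_net(x_t, t)      -> pre-trained score of the unaligned model
    reward_guidance_net(x_t, t) -> a learned log conditional reward
                                   expectation term (assumed interface)
    """
    x_t = x_t.detach().requires_grad_(True)
    # Scalar log-reward term whose gradient steers sampling toward
    # the reward-weighted distribution.
    log_reward = reward_guidance_net(x_t, t).sum()
    grad = torch.autograd.grad(log_reward, x_t)[0]
    # Guided score = pre-trained score + scaled reward-gradient guidance.
    return base_score_net(x_t, t) + scale * grad
```

A toy check with analytically simple "networks": if the base score is `-x` and the log-reward term is `-x**2 / 2` (gradient `-x`), the guided score is `-2x`, confirming the additive decomposition.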
Feb-3-2026