A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models

Neural Information Processing Systems

Direct Preference Optimization (DPO) is a key framework for aligning text-to-image models with human preferences. Stepwise Preference Optimization (SPO) extends it by using intermediate steps for preference learning, which yields more aesthetically pleasing images at significantly lower computational cost.