RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation

Boyuan Cao, Jiaxin Ye, Yujie Wei, Hongming Shan

Neural Information Processing Systems 

While latent diffusion models (LDMs), such as Stable Diffusion, are designed for high-resolution (HR) image generation, they often struggle with significant structural distortions when generating images at resolutions higher than their training one. Instead of relying on extensive retraining, a more resource-efficient approach is to reprogram the pretrained model for HR image generation; however, existing methods often result in poor image quality and long inference time. We introduce RepLDM, a novel reprogramming framework for pretrained LDMs that enables high-quality, high-efficiency, high-resolution image generation; see Fig. 1. RepLDM consists of two stages: (i) an attention guidance stage, which generates a latent representation of a higher-quality training-resolution image using a novel training-free self-attention mechanism to enhance the structural consistency; and (ii) a progressive upsampling stage, which progressively performs upsampling in pixel space to mitigate the severe artifacts caused by latent space upsampling.
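
The control flow of a progressive pixel-space upsampling stage might be sketched as follows. This is only a toy illustration, not RepLDM's implementation: the function names, the evenly spaced resolution schedule, and the nearest-neighbour interpolation are all assumptions. The abstract does not describe what happens between upsampling steps, so no model-side processing is shown.

```python
from typing import List

Image = List[List[float]]  # toy single-channel square image in pixel space


def resolution_schedule(train_res: int, target_res: int, stages: int) -> List[int]:
    """Evenly spaced side lengths ending at target_res (hypothetical schedule)."""
    if stages < 1 or target_res < train_res:
        raise ValueError("need stages >= 1 and target_res >= train_res")
    step = (target_res - train_res) / stages
    return [round(train_res + step * (i + 1)) for i in range(stages)]


def upsample_nearest(img: Image, size: int) -> Image:
    """Nearest-neighbour resize of a square image to size x size."""
    src = len(img)
    return [
        [img[(y * src) // size][(x * src) // size] for x in range(size)]
        for y in range(size)
    ]


def progressive_upsample(img: Image, target_res: int, stages: int) -> Image:
    """Upsample in several small pixel-space steps instead of one large jump."""
    for size in resolution_schedule(len(img), target_res, stages):
        img = upsample_nearest(img, size)
        # Assumption: any model-side processing between steps is omitted here.
    return img
```

For example, `resolution_schedule(512, 2048, 3)` gives the intermediate sizes 1024, 1536, and 2048 under this assumed linear schedule.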
