FreeInv Free Lunch for Improving

Neural Information Processing Systems

The naive DDIM inversion process usually suffers from trajectory deviation: the latent trajectory during reconstruction deviates from the one during inversion. To alleviate this issue, previous methods either learn to mitigate the deviation or design a cumbersome compensation strategy to reduce the mismatch error, incurring substantial time and computation cost. In this work, we present a nearly free-lunch method, FreeInv, that addresses the issue more effectively and efficiently. In FreeInv, we randomly transform the latent representation and keep the transformation the same between the corresponding inversion and reconstruction time steps. The method is motivated by a statistical observation: an ensemble of DDIM inversion processes over multiple trajectories yields a smaller trajectory mismatch error in expectation. Moreover, through theoretical analysis and empirical study, we show that FreeInv performs an efficient ensemble of multiple trajectories. FreeInv can be freely integrated into existing inversion-based image and video editing techniques. For inverting video sequences in particular, it brings more significant fidelity and efficiency improvements. Comprehensive quantitative and qualitative evaluation on the PIE benchmark and the DAVIS dataset shows that FreeInv remarkably outperforms conventional DDIM inversion and is competitive with previous state-of-the-art inversion methods, with superior computational efficiency.
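
The mechanism described in this abstract, a random latent transformation shared between matching inversion and reconstruction steps, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the noise predictor `toy_eps` is a hypothetical stand-in for a trained network, the transformation is assumed to be a random 90-degree rotation, and the point at which the rotation is applied and undone is one plausible reading of the abstract rather than the authors' implementation.

```python
import numpy as np

def toy_eps(z, t):
    """Hypothetical stand-in for a trained noise predictor eps_theta(z, t)."""
    return 0.5 * np.tanh(z + 0.1 * np.roll(z, 1, axis=-1)) * (1.0 + 0.01 * t)

def ddim_step(z, eps, a_from, a_to):
    """Deterministic DDIM update between cumulative alphas a_from and a_to."""
    z0_pred = (z - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * z0_pred + np.sqrt(1.0 - a_to) * eps

def invert(z0, alphas, ks, eps_fn=toy_eps):
    """Inversion: rotate the latent by ks[t] quarter turns, then add noise."""
    z = z0
    for t in range(len(alphas) - 1):
        z = np.rot90(z, ks[t], axes=(1, 2))   # assumed transform for step t
        z = ddim_step(z, eps_fn(z, t), alphas[t], alphas[t + 1])
    return z

def reconstruct(zT, alphas, ks, eps_fn=toy_eps):
    """Reconstruction: denoise, then undo the rotation used at the same step."""
    z = zT
    for t in reversed(range(len(alphas) - 1)):
        z = ddim_step(z, eps_fn(z, t + 1), alphas[t + 1], alphas[t])
        z = np.rot90(z, -ks[t], axes=(1, 2))  # same ks[t] as in inversion
    return z

rng = np.random.default_rng(0)
alphas = np.linspace(0.999, 0.05, 21)          # toy noise schedule
z0 = rng.standard_normal((4, 8, 8))            # (channels, height, width)
ks = rng.integers(0, 4, size=len(alphas) - 1)  # one rotation per time step
no_ks = np.zeros_like(ks)                      # all zeros = plain DDIM inversion

err_plain = np.abs(reconstruct(invert(z0, alphas, no_ks), alphas, no_ks) - z0).mean()
err_rand = np.abs(reconstruct(invert(z0, alphas, ks), alphas, ks) - z0).mean()
print(f"plain: {err_plain:.4f}  shared random rotations: {err_rand:.4f}")
```

The sketch only shows the bookkeeping: a single schedule `ks` is drawn once and reused in both directions, so no extra network evaluations are needed. The toy predictor is not expected to reproduce the gains reported in the abstract.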


Precise Diffusion Inversion: Towards Novel Samples and Few-Step Models

Neural Information Processing Systems

The diffusion inversion problem seeks to recover the latent generative trajectory of a diffusion model given a real image. Faithful inversion is critical for ensuring consistency in diffusion-based image editing. Prior works formulate this task as a fixed-point problem and solve it using numerical methods. However, achieving both accuracy and efficiency remains challenging, especially for few-step models and novel samples. In this paper, we propose PreciseInv, a general-purpose test-time optimization framework that enables fast and faithful inversion in as few as two inference steps.
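
To make the fixed-point formulation mentioned above concrete, here is a minimal sketch of one inversion step solved by plain fixed-point iteration. It illustrates the kind of prior approach the abstract refers to, not PreciseInv itself. The noise predictor `toy_eps`, the alpha values, and the iteration count are assumptions chosen so that the iteration is a contraction.

```python
import numpy as np

def toy_eps(z, t):
    """Hypothetical noise predictor; Lipschitz constant 0.5 by construction."""
    return 0.5 * np.tanh(z)

def ddim_step(z, eps, a_from, a_to):
    """Deterministic DDIM update between cumulative alphas a_from and a_to."""
    z0_pred = (z - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * z0_pred + np.sqrt(1.0 - a_to) * eps

def fixed_point_invert_step(z_t, t, a_t, a_next, eps_fn=toy_eps, n_iter=30):
    """Find z_next such that denoising z_next with eps(z_next, t+1) gives z_t.

    Naive DDIM inversion evaluates eps at (z_t, t); the exact inverse needs
    eps at the unknown (z_next, t+1), which makes the step implicit.
    """
    z_next = ddim_step(z_t, eps_fn(z_t, t), a_t, a_next)  # naive initial guess
    for _ in range(n_iter):
        z_next = ddim_step(z_t, eps_fn(z_next, t + 1), a_t, a_next)
    return z_next

rng = np.random.default_rng(0)
z_t = rng.standard_normal((4, 8, 8))
a_t, a_next = 0.9, 0.8  # toy cumulative alphas for steps t and t+1

def roundtrip_error(z_next):
    """Max deviation from z_t after denoising z_next back by one step."""
    z_back = ddim_step(z_next, toy_eps(z_next, 1), a_next, a_t)
    return np.abs(z_back - z_t).max()

z_naive = ddim_step(z_t, toy_eps(z_t, 0), a_t, a_next)
z_fp = fixed_point_invert_step(z_t, 0, a_t, a_next)
print(roundtrip_error(z_naive), roundtrip_error(z_fp))
```

Each iteration costs one evaluation of the noise predictor, which is why solving the fixed-point problem accurately can become expensive with a real network.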



Neural Information Processing Systems

A.1 Frequency ablation study

We perform an ablation study on the coarse-to-fine parameter α_d and the number of frequency bands L. In Figure 1, we show the surface reconstruction results of the DTU Buddha model under different frequency parameters. Each model is trained for 300K iterations. The first row shows the surface reconstruction quality under different coarse-to-fine parameters α_d. When the parameter is too small, the surface reconstruction tends to be oversmoothed. When the parameter is too large, many artifacts appear in the reconstruction results.
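
The excerpt does not state how the coarse-to-fine parameter enters the positional encoding. As a rough illustration of how such a parameter can trade smoothness against high-frequency artifacts, the sketch below uses a windowed positional encoding of the kind found in coarse-to-fine neural field methods. The function names, the cosine window, and the mapping of the parameter to `alpha` in [0, 1] are assumptions, not the authors' formulation.

```python
import numpy as np

def band_weights(alpha, num_bands):
    """Cosine window over frequency bands; alpha in [0, 1] opens bands low to high."""
    k = np.arange(num_bands)
    ramp = np.clip(alpha * num_bands - k, 0.0, 1.0)
    return 0.5 * (1.0 - np.cos(np.pi * ramp))

def encode(x, alpha, num_bands):
    """Windowed positional encoding of points x with shape (N, D)."""
    freqs = 2.0 ** np.arange(num_bands)        # (L,)
    w = band_weights(alpha, num_bands)         # (L,)
    angles = x[..., None] * freqs * np.pi      # (N, D, L)
    feats = np.concatenate([w * np.sin(angles), w * np.cos(angles)], axis=-1)
    return np.concatenate([x, feats.reshape(x.shape[0], -1)], axis=-1)

points = np.random.default_rng(0).uniform(-1.0, 1.0, size=(5, 3))
print(band_weights(0.5, 4))          # low bands open, high bands closed
print(encode(points, 0.5, 4).shape)  # (N, D + 2 * D * L)
```

Under this assumed form, a small `alpha` suppresses the high-frequency bands, which is consistent with the oversmoothing described above, while a large `alpha` lets every band through.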