A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation

Neural Information Processing Systems 

Instead of the standard fixed diffusion timestep, we propose applying variable diffusion timesteps across the temporal dimension and across modalities of the inputs.
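As a rough illustration of the idea, the sketch below draws a separate diffusion timestep for each (modality, time position) pair instead of a single timestep for the whole input, then applies the usual Gaussian forward-noising step. Everything here is an assumption made for illustration and is not taken from the paper: the function names `sample_timesteps` and `add_noise`, the mode names, the linear beta schedule, and the latent shapes are all hypothetical.

```python
import numpy as np


def sample_timesteps(num_modalities, num_frames, num_steps, mode, rng):
    """Return integer timesteps of shape (num_modalities, num_frames).

    The mode names are hypothetical labels for this sketch:
      "fixed"                  one timestep shared by everything (standard setup)
      "per_modality"           one timestep per modality
      "per_frame"              one timestep per time position
      "per_modality_and_frame" one timestep per (modality, time position)
    """
    shapes = {
        "fixed": (1, 1),
        "per_modality": (num_modalities, 1),
        "per_frame": (1, num_frames),
        "per_modality_and_frame": (num_modalities, num_frames),
    }
    if mode not in shapes:
        raise ValueError(f"unknown mode: {mode}")
    t = rng.integers(0, num_steps, size=shapes[mode])
    # Broadcast the sampled values to the full (modality, frame) grid.
    return np.broadcast_to(t, (num_modalities, num_frames)).copy()


def add_noise(x0, t, alpha_bar, rng):
    """Noise x0 of shape (num_frames, dim) with one timestep per frame."""
    eps = rng.standard_normal(x0.shape)
    a = alpha_bar[t][:, None]  # (num_frames, 1), broadcast over features
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps, eps


# Assumed linear beta schedule; the paper's schedule is not given here.
num_steps = 1000
betas = np.linspace(1e-4, 0.02, num_steps)
alpha_bar = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
audio = rng.standard_normal((8, 16))  # 8 time positions, 16-dim audio latents
video = rng.standard_normal((8, 32))  # 8 time positions, 32-dim video latents

t = sample_timesteps(2, 8, num_steps, "per_modality_and_frame", rng)
noisy_audio, _ = add_noise(audio, t[0], alpha_bar, rng)
noisy_video, _ = add_noise(video, t[1], alpha_bar, rng)
```

With `"fixed"`, every entry of `t` is identical, which corresponds to the standard single-timestep setup; the other modes let noise levels differ between audio and video, between time positions, or both.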
