Addressing Negative Transfer in Diffusion Models

Neural Information Processing Systems 

To achieve this, we propose leveraging existing multi-task learning (MTL) methods, but the huge number of denoising tasks makes it computationally expensive to calculate the necessary per-task losses or gradients.
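
To make the cost concrete, the following is a minimal sketch, not the paper's method. It assumes that each diffusion timestep is treated as one denoising task, that there are T = 1000 timesteps, and that per-sample losses are already available; the helper name `per_task_losses` and the random stand-in losses are hypothetical. It groups a batch's per-sample losses by timestep to obtain per-task losses.

```python
import random
from collections import defaultdict


def per_task_losses(timesteps, sample_losses):
    """Average per-sample losses within each timestep.

    Assumption: one denoising task per timestep.
    """
    buckets = defaultdict(list)
    for t, loss in zip(timesteps, sample_losses):
        buckets[t].append(loss)
    return {t: sum(v) / len(v) for t, v in buckets.items()}


# Hypothetical setup: T denoising tasks and a batch of 64 samples.
T = 1000
BATCH_SIZE = 64
rng = random.Random(0)
timesteps = [rng.randrange(T) for _ in range(BATCH_SIZE)]
# Stand-in values; a real model would supply per-sample denoising losses.
sample_losses = [rng.random() for _ in range(BATCH_SIZE)]

task_losses = per_task_losses(timesteps, sample_losses)
print(f"{len(task_losses)} of {T} tasks observed in this batch")
```

Under these assumptions, a single batch can cover at most 64 of the 1000 tasks, so per-task statistics are sparse, and an MTL method that needs one gradient per task would typically require a separate backward pass for each observed task.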