Scaling Diffusion Transformers Efficiently via μP
