FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model
