Efficient Scaling of Diffusion Transformers for Text-to-Image Generation