Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation