Audio Generation with Multiple Conditional Diffusion Model