Text-Driven Foley Sound Generation With Latent Diffusion Model