MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models