Diffsound: Discrete Diffusion Model for Text-to-sound Generation