From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion
Robin San Roman