Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech