Fine-Tuning Text-to-Speech Diffusion Models Using Reinforcement Learning with Human Feedback

Open in new window