DeepMind Generates High Fidelity Speech With GAN-TTS
Generative adversarial networks (GANs) have achieved state-of-the-art results in image and video generation and have been successfully applied to unsupervised feature learning, among many other applications. GANs have developed rapidly in recent years, but their potential for audio generation has received comparatively little attention. To explore that potential, a team of DeepMind researchers published a paper introducing a new model called GAN-TTS. Text-to-speech (TTS) is the process of converting text into humanlike speech. Many audio generation models operate in the waveform domain.
Oct-15-2019, 17:54:34 GMT