IBM's AI generates high-quality voices from 5 minutes of talking

Oct-1-2019, 04:51:12 GMT–#artificialintelligence

Training powerful text-to-speech models requires correspondingly powerful hardware. A recent study published by OpenAI drives the point home: it found that since 2012, the amount of compute used in the largest AI training runs has grown by more than 300,000 times. In pursuit of less demanding models, researchers at IBM developed a new lightweight and modular method for speech synthesis. They say it can synthesize high-quality speech in real time by learning different aspects of a speaker's voice, which makes it possible to adapt to new speaking styles and voices with small amounts of data. "Recent advances in deep learning are dramatically improving the development of Text-to-Speech (TTS) systems through more effective and efficient learning of voice and speaking styles of speakers and more natural generation of high-quality output speech," wrote IBM researchers Zvi Kons, Slava Shechtman, and Alex Sorin in a blog post accompanying a preprint paper presented at Interspeech 2019.


