Unsupervised Polyglot Text To Speech

Nachmani, Eliya; Wolf, Lior

arXiv.org Machine Learning 

ABSTRACT

We present a TTS neural network that is able to produce speech in multiple languages. The proposed network is able to transfer a voice, which was presented as a sample in a source language, into one of several target languages. The conversion is based on learning a polyglot network that has multiple per-language sub-networks and adding loss terms that preserve the speaker's identity in multiple languages. We evaluate the proposed polyglot neural network for three languages with a total of more than 400 speakers and demonstrate convincing conversion capabilities.

Index Terms: TTS, multilingual, unsupervised learning

1. INTRODUCTION

Neural text to speech (TTS) is an emerging technology that is becoming dominant over the alternative TTS technologies, in both quality and flexibility.
