Teochew-Wild: The First In-the-wild Teochew Dataset with Orthographic Annotations

Pan, Linrong, Jiang, Chenglong, Hou, Gaoze, Gao, Ying

arXiv.org Artificial Intelligence 

It encompasses both formal written language and a large number of words commonly used in daily life, demonstrating significant diversity.

V. EXPERIMENTS

In this section, we conduct TTS and ASR experiments on our Teochew-Wild dataset to validate its effectiveness. Because state-of-the-art TTS and ASR models typically require thousands of hours of training data to converge, we selected models suited to smaller datasets. Specifically, in the TTS experiment, we used the autoregressive (AR) model Tacotron2 [27] and the non-autoregressive (NAR) model FastSpeech2 [28] to predict mel-spectrograms, with the HiFi-GAN [29] vocoder converting them into waveforms. In the ASR experiment, we trained the Fairseq S2T Transformer XS [30] with both character-based and pinyin-based annotations.
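To make the distinction between character-based and pinyin-based ASR targets concrete, the following is a minimal sketch of how the two annotation types could be turned into token sequences and vocabularies. It is an illustration under stated assumptions, not the preprocessing used in the paper: the helper names (`char_tokens`, `pinyin_tokens`, `build_vocab`, `encode`) are hypothetical, the pinyin transcript is assumed to be whitespace-separated syllables, and the sample transcripts are placeholders rather than entries from Teochew-Wild.

```python
from collections import Counter


def char_tokens(text):
    """Character-based target: one token per non-whitespace character."""
    return [ch for ch in text if not ch.isspace()]


def pinyin_tokens(text):
    """Pinyin-based target: one token per whitespace-separated syllable (assumed format)."""
    return text.split()


def build_vocab(token_lists, specials=("<pad>", "<unk>")):
    """Map tokens to integer ids, most frequent first, ties broken alphabetically."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {tok: i for i, tok in enumerate(specials)}
    for tok, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Convert tokens to ids, mapping unseen tokens to <unk>."""
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in tokens]


if __name__ == "__main__":
    # Placeholder (character, pinyin) pairs for illustration only;
    # they are not taken from the dataset and the romanization is unverified.
    samples = [("你好", "le2 ho2"), ("你好吗", "le2 ho2 ma7")]

    char_vocab = build_vocab([char_tokens(c) for c, _ in samples])
    pinyin_vocab = build_vocab([pinyin_tokens(p) for _, p in samples])

    for chars, pinyin in samples:
        print(encode(char_tokens(chars), char_vocab),
              encode(pinyin_tokens(pinyin), pinyin_vocab))
```

In this sketch the two target types differ only in how a transcript is split into units. The same acoustic model can then be trained against either id sequence, which is what makes a side-by-side comparison of the two annotation schemes possible.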
