6804c9bca0a615bdb9374d00a9fcba59-AuthorFeedback.pdf

Neural Information Processing Systems 

We believe that these important contributions warrant publication in the conference.

State-of-the-art claims for text-to-speech: Furthermore, we will remove the state-of-the-art TTS claim made in line 87 in the final version. The MOS of ground truth audio in this dataset is 4.72.

R1: claiming "autoregressive models can be readily replaced with MelGAN decoder" (line 89, line 228) without

For the sake of brevity, the results are as follows: Original (4.19 ± 0.083), MelGAN (3.49

Yes, the exact same hardware and computing specifications were used to compare all the models.
