Review for NeurIPS paper: HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

May-31-2025, 19:13:40 GMT–Neural Information Processing Systems

This work initially received mixed reviews, but after the author feedback cleared up a misunderstanding, most reviewers now recommend acceptance. Nevertheless, I think R2 (who has not raised their score) has some valid concerns, which I want to account for in my decision. I have decided to recommend acceptance. The experimental section of this work is fairly comprehensive and adequately demonstrates that the proposed architecture is effective. However, it is important to point out that most of the experiments were conducted using ground-truth mel-spectrogram conditioning, which does not match the usual practical setting of TTS systems, where the spectrograms are themselves generated by a model (and are thus imperfect).
