92d1e1eb1cd6f9fba3227870bb6d7f07-AuthorFeedback.pdf

Jan-26-2025, 14:53:48 GMT–Neural Information Processing Systems

We thank the reviewers for their fruitful comments!

Response to Reviewer 2:

We predict characters for Librispeech/Libri-light. Thank you for the pointer!

"when the official LibriSpeech LM... is incorporated into decoding, it is not clear whether the experiments still represent ..."

We will also try to make it more self-contained given the space restrictions.

"I'm not convinced that this training works well conceptually."

"... for ASR, we have a lot of transcribed data, and we can make a strong ASR model and perform transfer learning."

"... how to extract K detractors." - The distractors are quantized latent speech representations sampled from masked ... If another masked time-step uses the same quantized latent, then it won't be sampled.

"The paper would have been significantly different in terms of quality had you applied your approach to some standard ..."

This follows other recent work on semi-supervised methods for speech such as "Improved Noisy Student Training Synnaeve et al., 2020" which achieve some of the strongest results.
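The distractor rule described in the response to Reviewer 2 can be illustrated with a small sketch. It is not the authors' implementation. It assumes that each time-step's quantized latent can be identified by an integer id, that distractors are drawn uniformly without replacement from the other masked time-steps, and that fewer than K are returned when too few candidates remain. The function name `sample_distractors` and all of its parameters are hypothetical.

```python
import random
from typing import List, Sequence


def sample_distractors(
    codes: Sequence[int],
    masked: Sequence[int],
    target: int,
    k: int,
    rng: random.Random,
) -> List[int]:
    """Return up to k masked time-steps to use as distractors for `target`.

    codes[t] is a hypothetical integer id of the quantized latent at step t.
    """
    target_code = codes[target]
    # Candidates are the other masked steps whose quantized latent differs
    # from the target's; steps sharing the same latent are never sampled.
    candidates = [t for t in masked if t != target and codes[t] != target_code]
    # Assumption: uniform sampling without replacement, capped by availability.
    return rng.sample(candidates, min(k, len(candidates)))


codes = [3, 7, 3, 5, 9, 7, 2]  # toy quantized-latent ids, one per time-step
masked = [0, 2, 3, 4, 6]       # toy set of masked time-steps
distractors = sample_distractors(codes, masked, target=0, k=3, rng=random.Random(0))
print(distractors)
```

In this toy input, time-step 2 shares the target's quantized latent (id 3), so it is excluded and only steps 3, 4 and 6 remain eligible.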

artificial intelligence, machine learning, representation

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.32)