Reviews: Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion
–Neural Information Processing Systems
Using the invertibility (forward-backward) of a flow-based model to do voice conversion is an overall clever idea. The novelty in machine learning/deep learning is limited. Section 4 reads more like an architecture-tuning summary of Glow/WaveGlow. Also, the quality of the posted audio samples and the subjective evaluation (both naturalness and similarity) need to be improved. So there is no information bottleneck like auto-encoder based models (e.g.
flow-based model, non-parallel raw-audio voice conversion, single-scale hyperconditioned flow, (4 more...)
Jan-25-2025, 18:46:30 GMT
- Technology: