A Recurrent Variational Autoencoder for Speech Enhancement
Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud
arXiv.org Artificial Intelligence
This paper presents a generative approach to speech enhancement based on a recurrent variational autoencoder (RVAE). The deep generative speech model is trained using clean speech signals only, and it is combined with a nonnegative matrix factorization noise model for speech enhancement. We propose a variational expectation-maximization algorithm where the encoder of the RVAE is fine-tuned at test time, to approximate the distribution of the latent variables given the noisy speech observations. Compared with previous approaches based on feed-forward fully-connected architectures, the proposed recurrent deep generative speech model induces a posterior temporal dynamic over the latent variables, which is shown to improve the speech enhancement results.
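As a rough illustration of how a speech variance model and a nonnegative matrix factorization (NMF) noise model can be combined, the sketch below builds a noise variance for each time-frequency bin as the product of two nonnegative matrices and uses it in a Wiener-type gain. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the final speech estimate is computed, so the Wiener-type gain is an assumption, the speech variances are random placeholders standing in for the output of an RVAE decoder, and all function and variable names are hypothetical.

```python
import random


def nmf_noise_variance(W, H):
    """Noise variance per (frequency, frame) bin as the matrix product W @ H.

    W has shape (n_freqs, rank) and H has shape (rank, n_frames); both are
    nonnegative, so every entry of the product is nonnegative too.
    """
    rank = len(H)
    n_frames = len(H[0])
    return [
        [sum(W[f][k] * H[k][n] for k in range(rank)) for n in range(n_frames)]
        for f in range(len(W))
    ]


def wiener_gain(speech_var, noise_var):
    """Assumed Wiener-type gain: speech variance over total variance."""
    return [
        [s / (s + v) for s, v in zip(s_row, v_row)]
        for s_row, v_row in zip(speech_var, noise_var)
    ]


def random_positive_matrix(n_rows, n_cols, rng):
    """Placeholder data: strictly positive entries drawn uniformly."""
    return [[rng.uniform(0.1, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]


rng = random.Random(0)
n_freqs, n_frames, rank = 5, 4, 2

# Placeholder for the speech variance an RVAE decoder would produce.
speech_var = random_positive_matrix(n_freqs, n_frames, rng)

# NMF noise model parameters (in practice estimated from the noisy signal).
W = random_positive_matrix(n_freqs, rank, rng)
H = random_positive_matrix(rank, n_frames, rng)

noise_var = nmf_noise_variance(W, H)
gain = wiener_gain(speech_var, noise_var)
```

Each gain value lies strictly between 0 and 1 and would multiply the corresponding noisy short-time Fourier transform coefficient; a larger estimated noise variance in a bin attenuates that bin more.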
Feb-10-2020
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe > France
- Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
- Genre:
- Research Report (0.50)
- Technology: