Unsupervised speech enhancement with deep dynamical generative speech and noise models
Lin, Xiaoyu, Leglaive, Simon, Girin, Laurent, Alameda-Pineda, Xavier
arXiv.org Artificial Intelligence
Noise-dependent (ND) methods use noise or noisy speech training samples to learn some noise characteristics. In contrast, noise-agnostic (NA) methods only use clean speech signals for training, and the noise characteristics are estimated at test time for each noisy speech sequence to process. A typical unsupervised NA approach uses a pre-trained variational autoencoder (VAE) as a prior distribution of the clean speech signal and a non-negative matrix factorization (NMF) model for the noise variance.

This work addresses unsupervised speech enhancement using a dynamical variational autoencoder (DVAE) as the clean speech model and non-negative matrix factorization (NMF) as the noise model. We propose to replace the NMF noise model with a deep dynamical generative model (DDGM) depending either on the DVAE latent variables, or on the noisy observations, or on both. This DDGM can be trained in three configurations: noise-agnostic, noise-dependent, and noise adaptation after noise-dependent training.
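The NMF noise model mentioned above factors the noise variance of each time-frequency bin as a low-rank product V = W H. A minimal sketch of how such a model can be fitted with multiplicative updates under the Itakura-Saito divergence is shown below; the function names and the toy spectrogram are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def is_divergence(S, V):
    """Itakura-Saito divergence between a power spectrogram S and a model V."""
    R = S / V
    return float(np.sum(R - np.log(R) - 1.0))

def fit_nmf_noise_model(S, K=8, n_iter=50):
    """Fit a rank-K NMF noise variance model V = W @ H to the power
    spectrogram S (frequency x time) using the standard multiplicative
    updates for the Itakura-Saito divergence."""
    F, T = S.shape
    W = rng.random((F, K)) + 0.1  # non-negative random initialization
    H = rng.random((K, T)) + 0.1
    losses = [is_divergence(S, W @ H)]
    for _ in range(n_iter):
        V = W @ H
        W *= ((S / V**2) @ H.T) / ((1.0 / V) @ H.T)
        V = W @ H
        H *= (W.T @ (S / V**2)) / (W.T @ (1.0 / V))
        losses.append(is_divergence(S, W @ H))
    return W, H, losses

# Toy "noisy" power spectrogram; in practice this would come from an STFT.
S = rng.random((64, 100)) + 0.1
W, H, losses = fit_nmf_noise_model(S)
V = W @ H  # low-rank estimate of the noise variance per time-frequency bin
```

The DDGM proposed in the paper replaces this fixed low-rank parameterization with a learned recurrent network, so the noise variance can depend on the DVAE latent variables and/or the noisy observations over time.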
Jun-13-2023