Metric-oriented Speech Enhancement using Diffusion Probabilistic Model

Chen Chen, Yuchen Hu, Weiwei Weng, Eng Siong Chng

Feb-23-2023–arXiv.org Artificial Intelligence

Deep neural network based speech enhancement techniques focus on learning a noisy-to-clean transformation supervised by paired training data. However, the task-specific evaluation metric (e.g., PESQ) is usually non-differentiable and cannot be directly incorporated into the training criteria. This mismatch between the training objective and the evaluation metric likely results in sub-optimal performance. To alleviate it, we propose a metric-oriented speech enhancement method (MOSE), which leverages recent advances in diffusion probabilistic models and integrates a metric-oriented training strategy into the reverse process. Specifically, we design an actor-critic based framework that treats the evaluation metric as a posterior reward, thus guiding the reverse process in the metric-increasing direction. Experimental results demonstrate that MOSE clearly benefits from metric-oriented training and surpasses the generative baselines on all evaluation metrics.
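The general idea of optimizing against a non-differentiable metric through a learned critic can be illustrated with a toy sketch. This is not the MOSE method and involves no diffusion model: the "enhancer" is a single scalar gain, the metric is a hypothetical quantized score standing in for something like PESQ, and the critic is a quadratic fitted to observed scores. All names (`enhance`, `metric`, `fit_quadratic`, `train_actor`) and the toy signals are assumptions made for illustration only.

```python
# Toy signals (hypothetical): the "noisy" signal is roughly 2x the clean one.
CLEAN = [1.0, -1.0, 0.5, -0.5, 0.8, -0.3]
NOISY = [2.1, -1.9, 1.2, -0.8, 1.5, -0.7]


def enhance(gain):
    """Actor: a one-parameter 'enhancement model' that rescales the noisy signal."""
    return [gain * x for x in NOISY]


def metric(estimate):
    """Stand-in for a non-differentiable metric: an MSE-based score
    quantized to steps of 0.1, so it is piecewise constant in the gain."""
    mse = sum((e - c) ** 2 for e, c in zip(estimate, CLEAN)) / len(CLEAN)
    return round((4.5 - 5.0 * mse) * 10) / 10


def fit_quadratic(xs, ys):
    """Critic: least-squares fit of y ~ a + b*x + c*x^2 via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]  # power sums of x
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def train_actor(gain, critic, lr=0.02, steps=200):
    """Move the gain uphill along the critic's gradient (the metric itself has none)."""
    _, b, c = critic
    for _ in range(steps):
        gain += lr * (b + 2.0 * c * gain)
    return gain


# 1) Observe metric scores (rewards) for a range of actor settings.
gains = [i * 0.05 for i in range(31)]
rewards = [metric(enhance(g)) for g in gains]
# 2) Fit the differentiable critic to those rewards.
critic = fit_quadratic(gains, rewards)
# 3) Update the actor using the critic's gradient.
initial_gain = 1.0
final_gain = train_actor(initial_gain, critic)
print("metric before:", metric(enhance(initial_gain)))
print("metric after: ", metric(enhance(final_gain)))
```

The critic here plays the role of a smooth surrogate for the metric, which is what lets a gradient-based update push the actor toward higher scores. In a realistic setting both actor and critic would be neural networks trained jointly.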

artificial intelligence, deep learning, machine learning

Country:
- North America > United States (0.14)
- Asia > Singapore (0.04)
- Europe > Italy
  - Calabria > Catanzaro Province > Catanzaro (0.04)

Genre:
- Research Report (0.84)

Industry:
- Government (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
