Using recurrences in time and frequency within U-net architecture for speech enhancement

Nov-16-2018–arXiv.org Machine Learning

ABSTRACT When designing a fully convolutional neural network, there is a tradeoff between receptive field size, number of parameters, and spatial resolution of features in the deeper layers of the network. In this work we present a novel network design, based on a combination of many convolutional and recurrent layers, that resolves these dilemmas. We compare our solution with U-net-based models known from the literature and with other baseline models on a speech enhancement task. We test our solution on TIMIT speech utterances combined with noise segments extracted from the NOISEX-92 database and show a clear advantage of the proposed solution in terms of SDR (signal-to-distortion ratio), SIR (signal-to-interference ratio), and STOI (short-time objective intelligibility) metrics compared to the current state of the art.

Index Terms: deep learning, speech enhancement, U-nets

1. INTRODUCTION The single-channel speech enhancement problem is to reduce the noise present in a single-channel recording of speech.
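The tradeoff named in the abstract (receptive field size versus parameter count versus feature resolution) can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the helper names `receptive_field` and `conv2d_params`, the layer configurations, and the channel counts are all assumptions chosen for the example. It uses the standard recurrence for a stack of undilated convolutions, where each layer widens the receptive field by (kernel - 1) times the accumulated stride.

```python
def receptive_field(layers):
    """Return (receptive_field, total_stride) for a stack of conv layers.

    `layers` is a list of (kernel_size, stride) pairs along one axis.
    Hypothetical helper for illustration; dilation is assumed to be 1.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump   # widen by the kernel extent at the current scale
        jump *= stride              # striding coarsens the feature grid
    return rf, jump


def conv2d_params(kernel, in_ch, out_ch):
    """Weights plus biases of a square 2-D convolution (assumed layer type)."""
    return kernel * kernel * in_ch * out_ch + out_ch


# Three ways to enlarge the receptive field, each with a different cost
# (16 input and 16 output channels per layer are assumed throughout).
same_res = receptive_field([(3, 1)] * 4)     # keeps resolution, small field
strided = receptive_field([(3, 2)] * 4)      # large field, coarse features
big_kernel = receptive_field([(9, 1)])       # keeps resolution, many weights

params_small_stack = 4 * conv2d_params(3, 16, 16)
params_big_kernel = conv2d_params(9, 16, 16)
```

Under these assumptions, four stride-1 3x3 layers see only 9 input bins per axis at full resolution, while four stride-2 layers see 31 bins but reduce the feature grid by a factor of 16. A single 9x9 layer matches the 9-bin field at full resolution but needs 20752 parameters against 9280 for the four-layer 3x3 stack.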


