Exploiting Low-dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition

Pranay Dighe, Gil Luyet, Afsaneh Asaei, Herve Bourlard

arXiv.org Machine Learning 

Two major emerging trends, deep neural networks (DNN) and exemplar-based sparse modeling, are different approaches to exploiting sparsity in speech representations to achieve invariance, discrimination and noise separation [5, 4, 6]. Speech utterances, meanwhile, are formed as a union of words, which in turn consist of phonetic components and subphonetic attributes. Each linguistic component is produced through the activation of a few highly constrained articulatory mechanisms, so speech data is generated in a union of low-dimensional subspaces [7, 8, 9]. However, most existing speech classification and acoustic modeling methods do not explicitly take this multi-subspace structure of the data into account. The present study focuses on exploiting the multi-subspace low-dimensional structure of speech, learned from the training data, to enhance DNN-based acoustic modeling of unseen test data. The approach therefore also has the potential to enable domain adaptation and to handle mismatch within the framework of DNN-based acoustic modeling.
