End-to-end Speech Recognition with Word-based RNN Language Models
Hori, Takaaki, Cho, Jaejin, Watanabe, Shinji
arXiv.org Artificial Intelligence
ABSTRACT
This paper investigates the impact of word-based RNN language models (RNN-LMs) on the performance of end-to-end automatic speech recognition (ASR). In our prior work, we proposed a multilevel LM, in which character-based and word-based RNN-LMs are combined in hybrid CTC/attention-based ASR. Although this multilevel approach achieves significant error reduction on the Wall Street Journal (WSJ) task, two different LMs need to be trained and used for decoding, which increases the computational cost and memory usage. In this paper, we further propose a novel word-based RNN-LM that allows us to decode with only the word-based LM: it provides look-ahead word probabilities to predict the next characters, taking the place of the character-based LM. This yields accuracy competitive with the multilevel LM at lower computational cost. We demonstrate the efficacy of the word-based RNN-LMs using a larger corpus, LibriSpeech, in addition to the WSJ corpus used in our prior work. Furthermore, we show that the proposed model achieves 5.1% WER on the WSJ Eval'92 test set when the vocabulary size is increased, which is the best WER reported for end-to-end ASR systems on this benchmark.

Index Terms: End-to-end speech recognition, language modeling, decoding, connectionist temporal classification, attention decoder

1. INTRODUCTION
Automatic speech recognition (ASR) is currently a mature set of widely deployed technologies that enable successful user interface applications such as voice search [1]. However, current systems lean heavily on the scaffolding of complicated legacy architectures that grew up around traditional techniques, including hidden Markov models (HMMs), Gaussian mixture models (GMMs), hybrid HMM/deep neural network (DNN) systems, and sequence discriminative training methods [2].
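To make the look-ahead idea from the abstract concrete, the following is a minimal sketch of how character-level probabilities might be derived from word-level ones. It is an illustration under stated assumptions, not the formulation used in the paper: the toy word distribution, the function name `lookahead_char_probs`, and the `EOW` marker are all hypothetical, and out-of-vocabulary handling is omitted. The assumption is that the probability of the next character, given a partial word, is the total probability of the words that continue the prefix with that character, divided by the total probability of all words sharing the prefix.

```python
from collections import defaultdict

# Hypothetical marker for "the word ends here" (e.g. a space character).
EOW = "<space>"


def lookahead_char_probs(word_probs, prefix):
    """Derive P(next character | prefix) from word-level probabilities.

    word_probs: dict mapping each vocabulary word to P(word | history),
                as a word-based LM would provide (toy values here).
    prefix:     the characters of the current word decoded so far.
    """
    # Probability mass of all words that start with the prefix.
    prefix_mass = sum(p for w, p in word_probs.items() if w.startswith(prefix))
    if prefix_mass == 0.0:
        return {}  # the prefix matches no vocabulary word

    char_mass = defaultdict(float)
    for word, p in word_probs.items():
        if not word.startswith(prefix):
            continue
        if len(word) == len(prefix):
            # The prefix is itself a complete word: mass goes to the boundary.
            char_mass[EOW] += p
        else:
            # Otherwise the mass goes to the character that follows the prefix.
            char_mass[word[len(prefix)]] += p

    return {c: m / prefix_mass for c, m in char_mass.items()}


# Toy word distribution (assumed values, for illustration only).
word_probs = {"the": 0.5, "then": 0.2, "they": 0.1, "cat": 0.2}
probs = lookahead_char_probs(word_probs, "the")
```

With this toy distribution, the prefix "the" assigns roughly 0.625 to the word boundary and splits the rest between "n" and "y". A practical decoder would presumably organize the vocabulary as a prefix tree so that these sums are not recomputed from scratch for every hypothesis.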
Aug-7-2018