Highway Long Short-Term Memory RNNs for Distant Speech Recognition
Zhang, Yu, Chen, Guoguo, Yu, Dong, Yao, Kaisheng, Khudanpur, Sanjeev, Glass, James
arXiv.org Artificial Intelligence
ABSTRACT
In this paper, we extend deep long short-term memory (DLSTM) recurrent neural networks by introducing gated direct connections between memory cells in adjacent layers. These direct links, called highway connections, enable unimpeded information flow across layers and thus alleviate the vanishing gradient problem when building deeper LSTMs. We further introduce latency-controlled bidirectional LSTMs (BLSTMs), which can exploit the whole history while keeping latency under control. Efficient algorithms are proposed to train these novel networks using both frame and sequence discriminative criteria. Experiments on the AMI distant speech recognition (DSR) task indicate that, with highway LSTMs (HLSTMs), we can train deeper LSTMs and obtain larger gains from sequence training. HLSTMs beat the strong DNN and DLSTM baselines with 15.7% and 5.3% relative improvement, respectively.

Index Terms: Highway LSTM, CNTK, LSTM, Sequence Training

1. INTRODUCTION
Recently, deep neural network (DNN)-based acoustic models (AMs) have greatly improved automatic speech recognition (ASR) accuracy on many tasks [1, 2, 3, 4].
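The gated direct connection between memory cells in adjacent layers can be illustrated with a minimal sketch. The abstract does not give the update equations, so the form below is an assumption: a carry gate `d` scales the lower layer's cell state and adds it to the usual LSTM cell update of the upper layer. The function names `sigmoid` and `highway_cell_update`, and all argument names, are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, used to squash a gate pre-activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def highway_cell_update(c_prev_upper, c_lower, i_gate, f_gate, d_gate, candidate):
    """One element-wise cell update for the upper layer at a single time step.

    Assumed form (not taken from the abstract):
        c = d * c_lower + f * c_prev_upper + i * candidate

    c_prev_upper: upper layer's cell state at the previous time step
    c_lower:      lower layer's cell state at the current time step
    i_gate:       input gate activations, each in [0, 1]
    f_gate:       forget gate activations, each in [0, 1]
    d_gate:       carry (highway) gate activations, each in [0, 1]
    candidate:    candidate cell values (typically tanh-squashed)
    """
    return [
        d * cl + f * cp + i * g
        for cp, cl, i, f, d, g in zip(
            c_prev_upper, c_lower, i_gate, f_gate, d_gate, candidate
        )
    ]


# With the carry gate closed (d = 0) this reduces to a plain LSTM cell update.
plain = highway_cell_update([2.0], [9.0], [1.0], [0.5], [0.0], [0.25])

# With the carry gate fully open and the other gates closed, the lower layer's
# cell state passes through unchanged.
passthrough = highway_cell_update(
    [0.5, -1.0], [2.0, 3.0], [0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [0.3, 0.3]
)
```

In this sketch the pass-through case shows why such a link could ease training of deeper stacks: when the carry gate is open, the cell state (and its gradient) crosses the layer boundary without passing through a nonlinearity.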
Jan-11-2016
- Country:
- North America > United States
- Massachusetts > Middlesex County
- Cambridge (0.04)
- Washington > King County
- Seattle (0.04)
- Genre:
- Research Report > New Finding (0.47)
- Technology: