Stabilising and accelerating light gated recurrent units for automatic speech recognition

arXiv.org Artificial Intelligence

The light gated recurrent unit (Li-GRU) is well known for achieving impressive results in automatic speech recognition (ASR) tasks while being lighter and faster to train than a standard gated recurrent unit (GRU). However, the unbounded nature of its rectified linear unit on the candidate recurrent gate induces an important exploding-gradient phenomenon that disrupts the training process and prevents it from being applied to famous datasets. In this paper, we theoretically and empirically derive the necessary conditions for its stability, as well as engineering mechanisms that speed up its training time by a factor of five, hence introducing a novel version of this architecture.

Hence, the choice of the recurrent unit is of crucial interest to achieve state-of-the-art word error rates. For instance, the light gated recurrent unit (Li-GRU) [8] network has been designed to carefully address the task of ASR. A Li-GRU is a compact single-gate unit derived from the gated recurrent unit (GRU), which reduces the per-epoch training time by 30% over a standard GRU while also improving ASR accuracy. Nevertheless, and despite a clear interest from the community, two major issues prevent a stronger adoption of the Li-GRU: (1) it suffers heavily from exploding gradients, as the gate is unbounded; and (2) no optimized implementation exists, hence leading to much larger training times than more complex alternatives such as LSTM neural networks.
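To make the instability concrete, the following is a minimal scalar sketch of a single-gate recurrent unit with a ReLU candidate state. It assumes the commonly cited Li-GRU form (one sigmoid update gate, ReLU candidate, linear interpolation with the previous state) and omits the batch normalisation usually applied to the input projections. The function names, the toy weights, and the constant input are hypothetical choices made for illustration; this is not the stability analysis or the implementation proposed in the paper.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def li_gru_step(x_t, h_prev, w_z, u_z, w_h, u_h):
    """One step of a scalar Li-GRU-style unit (batch norm omitted, an assumption)."""
    # Single update gate, bounded in (0, 1).
    z_t = sigmoid(w_z * x_t + u_z * h_prev)
    # ReLU candidate state: bounded below by 0 but unbounded above.
    h_cand = max(0.0, w_h * x_t + u_h * h_prev)
    # Interpolate between the previous state and the candidate.
    return z_t * h_prev + (1.0 - z_t) * h_cand


def final_state(u_h, steps=50):
    """Run the unit on a constant input of 1.0 with toy weights (hypothetical)."""
    h = 0.0
    for _ in range(steps):
        # w_z = u_z = 0 pins the gate at sigmoid(0) = 0.5 to keep the arithmetic simple.
        h = li_gru_step(1.0, h, w_z=0.0, u_z=0.0, w_h=1.0, u_h=u_h)
    return h


stable = final_state(0.5)    # per-step factor 0.75: the state settles near 2
unstable = final_state(2.0)  # per-step factor 1.5: the state grows geometrically
print(stable, unstable)
```

In this toy setting the update reduces to h_t = 0.5 (1 + u_h) h_{t-1} + 0.5, so the derivative of h_t with respect to h_{t-1} is 0.5 (1 + u_h). Once that factor exceeds 1, both the state and the gradient propagated back through time grow geometrically, because the ReLU candidate has no upper bound to limit them. A saturating candidate such as tanh would cap the state instead.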