Stabilising and accelerating light gated recurrent units for automatic speech recognition
Moumen, Adel, Parcollet, Titouan
arXiv.org Artificial Intelligence
The light gated recurrent unit (Li-GRU) is well known for achieving impressive results in automatic speech recognition (ASR) tasks while being lighter and faster to train than a standard gated recurrent unit (GRU). However, the unbounded nature of the rectified linear unit on its candidate recurrent gate induces a significant exploding-gradient phenomenon that disrupts the training process and prevents it from being applied to famous datasets. In this paper, we theoretically and empirically derive the necessary conditions for its stability, as well as engineering mechanisms that speed up its training time by a factor of five, hence introducing a novel version of this architecture.

Hence, the choice of the recurrent unit is of crucial interest to achieve state-of-the-art word error rates. For instance, the light gated recurrent unit (Li-GRU) [8] network has been designed to carefully address the task of ASR. A Li-GRU is a compact single-gate recurrent unit derived from the gated recurrent unit (GRU) which reduces the per-epoch training time by 30% over a standard GRU while also improving the ASR accuracy. Nevertheless, and despite a clear interest from the community, two major issues prevent a stronger adoption of the Li-GRU: (1) it suffers heavily from exploding gradients, as the gate is unbounded; and (2) no optimized implementation exists, hence leading to much longer training times than more complex alternatives such as LSTM neural networks.
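To make the instability concrete, the following is a minimal pure-Python sketch of one Li-GRU time step in its commonly published form: a single sigmoid update gate and a ReLU candidate state. It is an illustration under stated assumptions, not the authors' implementation. Batch normalisation on the input projections is omitted, and the helper names (`matvec`, `ligru_step`, `run`) and the one-unit toy weights are hypothetical. The toy loop shows how the unbounded ReLU candidate lets the hidden state grow geometrically once the recurrent weight is large enough.

```python
import math

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def ligru_step(x, h_prev, Wz, Uz, Wh, Uh):
    """One Li-GRU step. Assumption: batch normalisation is omitted for brevity."""
    az = [a + b for a, b in zip(matvec(Wz, x), matvec(Uz, h_prev))]
    ah = [a + b for a, b in zip(matvec(Wh, x), matvec(Uh, h_prev))]
    z = [sigmoid(a) for a in az]       # single update gate, bounded in (0, 1)
    cand = [max(0.0, a) for a in ah]   # ReLU candidate state, unbounded above
    # Interpolate between the previous state and the candidate state.
    return [zi * hp + (1.0 - zi) * c for zi, hp, c in zip(z, h_prev, cand)]

def run(steps, u_scale):
    """Hypothetical 1-unit demo: constant input 1.0, recurrent weight u_scale."""
    h = [0.0]
    for _ in range(steps):
        h = ligru_step([1.0], h, Wz=[[0.0]], Uz=[[0.0]],
                       Wh=[[1.0]], Uh=[[u_scale]])
    return h[0]

if __name__ == "__main__":
    print("recurrent weight 0.5:", run(200, 0.5))  # settles to a finite value
    print("recurrent weight 2.0:", run(50, 2.0))   # grows geometrically
```

In this toy setting the gate is fixed at 0.5, so each step multiplies the state by 0.5 + 0.5 * u_scale. That factor is below one for a recurrent weight of 0.5 and above one for a weight of 2.0, and the same factor scales the gradient flowing back through time. A bounded candidate activation, or a constraint on the recurrent weights, would remove this growth.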
Feb-16-2023
- Country:
- South America > Chile
- Europe
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Italy > Calabria
- Catanzaro Province > Catanzaro (0.04)
- Genre:
- Research Report (0.40)
- Technology: