end-to-end speech recognition model
How to Build Your Own End-to-End Speech Recognition Model in PyTorch
Deep Learning has changed the game in speech recognition with the introduction of end-to-end models. Deep Learning has changed the game in speech recognition with the introduction of end-to-end models. These models take in audio, and directly output transcriptions. Two of the most popular end-to-end models today are Deep Speech by Baidu, and Listen Attend Spell (LAS) by Google. Both Deep Speech and LAS, are recurrent neural network (RNN) based architectures with different approaches to modeling speech recognition.