Fully Neural Network Based Speech Recognition on Mobile and Embedded Devices

Mar-16-2026, 20:25:58 GMT–Neural Information Processing Systems

Real-time automatic speech recognition (ASR) on mobile and embedded devices has been of great interest for many years. We present real-time speech recognition on smartphones or embedded systems that employs recurrent neural network (RNN) based acoustic models, RNN based language models, and beam-search decoding. The acoustic model is trained end-to-end with the connectionist temporal classification (CTC) loss. RNN implementations on embedded devices can suffer from excessive DRAM accesses because the parameter size of a neural network usually exceeds the capacity of the cache memory and each parameter is used only once per time step. To remedy this problem, we employ a multi-time-step parallelization approach that computes multiple output samples at a time with the parameters fetched from the DRAM.
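
The DRAM argument above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `matvec_per_step` and `matmul_block` are hypothetical, the `weight_reads` counter is only a rough model of parameter fetches rather than a measurement, and the sketch assumes the product being batched depends only on the inputs (not on the previous hidden state), which the abstract does not spell out.

```python
def matvec_per_step(W, xs):
    """Baseline: one matrix-vector product per time step.

    Every weight is fetched again for each time step.
    """
    weight_reads = 0
    outputs = []
    for x in xs:
        y = []
        for row in W:
            weight_reads += len(row)  # this row is re-read for this step
            y.append(sum(w * xi for w, xi in zip(row, x)))
        outputs.append(y)
    return outputs, weight_reads


def matmul_block(W, xs, block):
    """Multi-time-step variant: process `block` time steps per weight fetch.

    Each weight row is fetched once per block and reused for all steps in it.
    """
    weight_reads = 0
    outputs = []
    for start in range(0, len(xs), block):
        chunk = xs[start:start + block]
        ys = [[] for _ in chunk]
        for row in W:
            weight_reads += len(row)  # fetched once, reused across the chunk
            for t, x in enumerate(chunk):
                ys[t].append(sum(w * xi for w, xi in zip(row, x)))
        outputs.extend(ys)
    return outputs, weight_reads


# Toy example: a 2x3 weight matrix and 8 time steps of 3-dimensional input.
W = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]
xs = [[float(t), 1.0, -1.0] for t in range(8)]

base_out, base_reads = matvec_per_step(W, xs)
blk_out, blk_reads = matmul_block(W, xs, block=4)
```

With 8 time steps and a block of 4, both functions produce identical outputs, but the modelled weight fetches drop from 48 to 12, because the 6 weights are read twice instead of eight times. The price is latency: a block of outputs only becomes available once all of its input frames have arrived.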

artificial intelligence, machine learning, proceedings, (10 more...)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (0.84)
  - Machine Learning > Neural Networks
    - Deep Learning (0.58)