Iterative Compression of End-to-End ASR Model using AutoML

Abhinav Mehrotra, Łukasz Dudziak, Jinsu Yeo, Young-yoon Lee, Ravichander Vipperla, Mohamed S. Abdelfattah, Sourav Bhattacharya, Samin Ishtiaq, Alberto Gil C. P. Ramos, SangJeong Lee, Daehyun Kim, Nicholas D. Lane

arXiv.org Machine Learning 

Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interest in developing automatic model compression techniques. Past research has shown that an AutoML-based Low Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7x, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques only work up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5x compression without degrading the WER, thereby advancing the state-of-the-art in ASR compression.
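As a minimal illustration of the Low Rank Factorization underlying this line of work (not the paper's actual pipeline, and with purely illustrative layer shapes and rank), a weight matrix W can be replaced by two thin factors obtained from a truncated SVD; the per-layer rank is the quantity an AutoML search would select:

```python
import numpy as np

# Hypothetical LRF sketch: approximate a weight matrix W (m x n) by two
# smaller factors U_r (m x r) and V_r (r x n) via truncated SVD. The rank r
# is the knob an AutoML search would tune per layer; values here are
# illustrative, not taken from the paper.
rng = np.random.default_rng(0)
m, n, rank = 512, 256, 64
W = rng.standard_normal((m, n))

U, s, Vt = np.linalg.svd(W, full_matrices=False)
U_r = U[:, :rank] * s[:rank]   # absorb singular values into the left factor
V_r = Vt[:rank, :]
W_approx = U_r @ V_r           # low-rank approximation of W

# Parameter-count compression from storing the two factors instead of W.
orig_params = m * n
lrf_params = rank * (m + n)
compression = orig_params / lrf_params
print(f"compression ratio: {compression:.2f}x")
```

An iterative variant, as proposed in the abstract, would apply such a factorization-and-finetune step repeatedly, re-searching ranks at each round rather than compressing to the final target in one shot.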
