Iterative Compression of End-to-End ASR Model using AutoML
Mehrotra, Abhinav, Dudziak, Łukasz, Yeo, Jinsu, Lee, Young-yoon, Vipperla, Ravichander, Abdelfattah, Mohamed S., Bhattacharya, Sourav, Ishtiaq, Samin, Ramos, Alberto Gil C. P., Lee, SangJeong, Kim, Daehyun, Lane, Nicholas D.
Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interest in developing automatic model compression techniques. Past research has shown that the AutoML-based Low-Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7x, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques only work up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5x compression without degrading the WER, thereby advancing the state-of-the-art in ASR compression.
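The LRF step at the heart of this abstract can be sketched with a truncated SVD: a weight matrix W is replaced by two thin factors whose product approximates W, and the rank is lowered iteratively while a quality proxy stays within budget. This is a minimal illustration only; the paper's actual rank selection is driven by an AutoML agent and validated against WER, and the error budget below is an assumed stand-in.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Factorize W (m x n) into A (m x rank) @ B (rank x n) via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

def compression_ratio(W, A, B):
    """Parameter count of W divided by that of the two factors."""
    return W.size / (A.size + B.size)

# Illustrative iterative loop: lower the rank step by step and stop before the
# relative reconstruction error (a stand-in for WER degradation) exceeds a
# hypothetical budget.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 48))
rank, budget = min(W.shape), 0.9  # budget value is purely illustrative
while rank > 4:
    A, B = low_rank_factorize(W, rank - 4)
    err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
    if err > budget:
        break
    rank -= 4
A, B = low_rank_factorize(W, rank)
```

With a random 64x48 matrix this yields a genuine parameter reduction whenever the chosen rank is well below `min(m, n)`; for real ASR layers the achievable rank depends on how much redundancy training has left in the weights.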
Aug-6-2020