Structured Sparsity and Weight-Adaptive Pruning for Memory- and Compute-Efficient Whisper Models

Prasenjit K. Mudi, Anshi Sachan, Dahlia Devapriya, Sheetal Kalyani

arXiv.org Artificial Intelligence 

ABSTRACT Whisper models have achieved remarkable progress in speech recognition, yet their large size remains a bottleneck for deployment on resource-constrained edge devices. This paper proposes a framework for designing fine-tuned Whisper variants that address this problem. Structured sparsity is enforced via the Sparse Group LASSO penalty as a loss regularizer, reducing the number of floating-point operations (FLOPs). Further, a weight-statistics-aware pruning algorithm is proposed. On the Common Voice 11.0 Hindi dataset, without degrading WER, we obtain (a) a 35.4% reduction in model parameters, 14.25% lower memory consumption, and 18.5% fewer FLOPs on Whisper-small; (b) a 31% reduction in model parameters, 15.29% lower memory consumption, and 16.95% fewer FLOPs on Whisper-medium; and (c) we substantially outperform the state-of-the-art Iterative Magnitude Pruning based method, pruning 18.7% more parameters along with a 12.31-point reduction in WER.
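
To make the structured-sparsity regularizer concrete, here is a minimal PyTorch sketch of a Sparse Group LASSO penalty. It assumes each output row of a weight matrix forms one group; the grouping, `lam`, and `alpha` values are illustrative assumptions, not the paper's exact configuration.

```python
import torch

def sparse_group_lasso(weights, lam=1e-4, alpha=0.5):
    """Sparse Group LASSO penalty:
        lam * [ (1 - alpha) * sum_g sqrt(p_g) * ||w_g||_2  +  alpha * ||w||_1 ]
    The group-wise L2 term drives entire groups to zero (structured
    sparsity, which translates into FLOP savings); the elementwise L1
    term sparsifies weights within the surviving groups.
    `weights`: iterable of weight tensors (biases typically excluded).
    `lam`, `alpha`, and the row-wise grouping are illustrative choices."""
    group_term, l1_term = 0.0, 0.0
    for w in weights:
        w2d = w.view(w.shape[0], -1)   # one group per output row
        p_g = w2d.shape[1]             # group size
        group_term = group_term + (p_g ** 0.5) * w2d.norm(dim=1).sum()
        l1_term = l1_term + w.abs().sum()
    return lam * ((1 - alpha) * group_term + alpha * l1_term)


# Hypothetical training-step usage on a Whisper model's linear layers:
# penalty = sparse_group_lasso(
#     [m.weight for m in model.modules() if isinstance(m, torch.nn.Linear)])
# loss = task_loss + penalty
```

Here `alpha` interpolates between a pure group LASSO (`alpha=0`, purely structured sparsity) and a pure LASSO (`alpha=1`, purely unstructured sparsity).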
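The abstract does not detail the weight-statistics-aware pruning criterion; the sketch below shows one plausible instantiation, zeroing rows whose mean absolute weight falls more than `k` standard deviations below the layer's average row score. The row granularity, the score, and `k` are all assumptions made for illustration, not the paper's algorithm.

```python
import torch

@torch.no_grad()
def stats_aware_prune_(weight, k=1.0):
    """Illustrative statistics-driven structured pruning (in place):
    score each output row by its mean |w|, then zero rows scoring below
    (mean - k * std) of the layer's row-score distribution. The actual
    criterion in the paper may differ; this is a hypothetical sketch."""
    w2d = weight.view(weight.shape[0], -1)
    scores = w2d.abs().mean(dim=1)                # per-row statistic
    threshold = scores.mean() - k * scores.std()  # layer-adaptive cutoff
    keep = (scores >= threshold).to(weight.dtype) # 1 = keep, 0 = prune
    w2d.mul_(keep.unsqueeze(1))                   # zero pruned rows
    return keep
```

Because the threshold is derived from each layer's own weight statistics rather than a single global magnitude cutoff, such a criterion adapts the pruning rate per layer, which is one way a statistics-aware scheme can differ from plain magnitude pruning.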