Structured Sparsity and Weight-adaptive Pruning for Memory and Compute efficient Whisper models
