Compressing Transformer-based self-supervised models for speech processing

Tzu-Quan Lin, Tsung-Huan Yang, Chun-Yao Chang, Kuang-Ming Chen, Tzu-hsun Feng, Hung-yi Lee, Hao Tang

arXiv.org Artificial Intelligence 

Despite the success of Transformers in self-supervised learning with applications to various downstream tasks, the computational cost of training and inference remains a major challenge for applying these models to a wide spectrum of devices. Several isolated attempts have been made to compress Transformers prior to applying them to downstream tasks. In this work, we aim to provide context for these isolated results, studying several commonly used compression techniques, including weight pruning, head pruning, low-rank approximation, and knowledge distillation. We report wall-clock time, the number of parameters, and the number of multiply-accumulate operations for these techniques, charting the landscape of compressing Transformer-based self-supervised models.
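To give a concrete flavor of one of the techniques listed in the abstract, the following is a minimal sketch of low-rank approximation applied to a single Transformer linear layer, assuming PyTorch. The layer sizes, the chosen rank, and helper names such as low_rank_factorize are illustrative assumptions for this sketch, not the paper's actual implementation or settings.

```python
# Minimal sketch (assumed PyTorch): replace a dense nn.Linear layer with two
# smaller layers obtained from a truncated SVD of its weight matrix, and
# compare parameter counts. Sizes and rank are illustrative, not from the paper.
import torch
import torch.nn as nn


def low_rank_factorize(linear: nn.Linear, rank: int) -> nn.Sequential:
    """Approximate Linear(in, out) with Linear(in, rank) -> Linear(rank, out)."""
    W = linear.weight.data                      # shape: (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]                # (out_features, rank)
    V_r = Vh[:rank, :]                          # (rank, in_features)

    first = nn.Linear(linear.in_features, rank, bias=False)
    second = nn.Linear(rank, linear.out_features, bias=linear.bias is not None)
    first.weight.data = V_r.contiguous()
    second.weight.data = U_r.contiguous()
    if linear.bias is not None:
        second.bias.data = linear.bias.data.clone()
    return nn.Sequential(first, second)


def num_params(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())


if __name__ == "__main__":
    # Feed-forward projection sized like a typical Transformer block (768 -> 3072).
    dense = nn.Linear(768, 3072)
    compressed = low_rank_factorize(dense, rank=128)

    x = torch.randn(4, 768)
    err = (dense(x) - compressed(x)).abs().max().item()
    print(f"params: {num_params(dense)} -> {num_params(compressed)}")
    print(f"max abs output difference on random input: {err:.4f}")
```

The same parameter-counting idea extends to the other reported costs: multiply-accumulate operations for the factorized layer scale with the rank rather than with the product of input and output dimensions, which is why the paper reports MACs and wall-clock time alongside parameter counts.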
