Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition

Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang

arXiv.org Artificial Intelligence 

Self-supervised learning (SSL) has achieved great success in speech processing, but always with a large model size to increase the modeling capacity. This may limit its potential applications due to the expensive computation and memory costs introduced by the oversize model. Compression for SSL models has therefore become an important research direction of practical value. To this end, we explore the effective distillation of HuBERT-based SSL models for automatic speech recognition.

Inspired by compression works [9, 10] on the BERT model in the NLP domain, several previous studies have investigated the distillation of SSL models in the speech domain [11, 12, 13, 14], which attempt to reduce the model size of a well-trained SSL model in an unsupervised fashion. Most of these existing works are evaluated on the SUPERB benchmark [15], a generic testing framework for pre-trained models across a range of downstream tasks. The SSL models are evaluated in the constrained track, where the whole upstream model is frozen.
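As a concrete illustration of the unsupervised distillation setup described above, the sketch below shows a DistilHuBERT-style layer-prediction objective, in which a small student regresses the hidden states of selected frozen teacher layers through per-layer projection heads, combining an L1 term with a cosine-similarity term. This is a minimal sketch of one common formulation in this line of work, not the specific method proposed in this paper; all class and variable names are hypothetical.

```python
# Illustrative sketch (hypothetical names, not the paper's exact recipe):
# a DistilHuBERT-style layer-prediction loss. A small student predicts the
# hidden states of selected, frozen teacher layers through per-layer linear
# heads; each head's loss combines an L1 term with a cosine-similarity term.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LayerPredictionLoss(nn.Module):
    def __init__(self, student_dim: int, teacher_dim: int,
                 num_target_layers: int, cos_weight: float = 1.0):
        super().__init__()
        # One linear prediction head per distilled teacher layer.
        self.heads = nn.ModuleList(
            nn.Linear(student_dim, teacher_dim) for _ in range(num_target_layers)
        )
        self.cos_weight = cos_weight

    def forward(self, student_hidden: torch.Tensor,
                teacher_hiddens: list) -> torch.Tensor:
        # student_hidden:  (batch, time, student_dim), last student layer
        # teacher_hiddens: one (batch, time, teacher_dim) tensor per target
        #                  layer, taken from the frozen teacher
        total = student_hidden.new_zeros(())
        for head, target in zip(self.heads, teacher_hiddens):
            pred = head(student_hidden)
            l1 = F.l1_loss(pred, target)
            cos = F.cosine_similarity(pred, target, dim=-1).mean()
            # Minimize L1 distance while encouraging high cosine similarity.
            total = total + l1 - self.cos_weight * F.logsigmoid(cos)
        return total / len(self.heads)
```

In such a setup, the teacher's hidden states would be computed under torch.no_grad(), so that only the student and the prediction heads receive gradients; no transcripts are required, which matches the unsupervised nature of the distillation works cited above.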
