An Empirical Analysis of Speech Self-Supervised Learning at Multiple Resolutions
Clark, Theo; Cevoli, Benedetta; de Jong, Eloy; Abramski, Timofey; Dougherty, Jamie
arXiv.org Artificial Intelligence
Self-supervised learning (SSL) models have become crucial in speech processing, with recent advancements concentrating on developing architectures that capture representations across multiple timescales. The primary goal of these multi-scale architectures is to exploit the hierarchical nature of speech, where lower-resolution components aim to capture representations that align with increasingly abstract concepts (e.g., from phones to words to sentences). Although multi-scale approaches have demonstrated some improvements over single-scale models, the reasons for these improvements lack strong empirical support. In this study, we present an initial analysis of layer-wise representations in multi-scale architectures, with a focus on Canonical Correlation Analysis (CCA) and Mutual Information (MI). We apply this analysis to Multi-Resolution HuBERT (MR-HuBERT) and find that (1) the improved performance on SUPERB tasks is primarily due to the auxiliary low-resolution loss rather than the downsampling itself, and (2) downsampling to lower resolutions neither improves downstream performance nor correlates with higher-level information (e.g., words), though it does improve computational efficiency. These findings challenge assumptions about the multi-scale nature of MR-HuBERT and underline the importance of disentangling computational efficiency from learning better representations.
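To make the layer-wise CCA comparison mentioned in the abstract concrete, the following is a minimal sketch of plain CCA between two feature matrices, computed with a QR decomposition followed by an SVD. It is not necessarily the CCA variant used in the paper; the function names `cca_correlations` and `mean_cca_similarity` are hypothetical, and the inputs are synthetic random matrices standing in for frame-level layer representations, not MR-HuBERT features.

```python
import numpy as np


def cca_correlations(x, y):
    """Canonical correlations between two (n_frames, dim) feature matrices.

    Assumes n_frames exceeds both feature dimensions and that both
    matrices have full column rank after centering.
    """
    # Center each feature dimension.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Orthonormal bases for the column spaces of x and y.
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


def mean_cca_similarity(x, y):
    """Average canonical correlation, a simple scalar similarity score."""
    return float(cca_correlations(x, y).mean())


# Synthetic stand-ins for two layers' representations (assumption).
rng = np.random.default_rng(0)
layer_a = rng.normal(size=(2000, 8))
layer_b = layer_a @ rng.normal(size=(8, 8))  # linear transform of layer_a
unrelated = rng.normal(size=(2000, 8))       # independent features

sim_related = mean_cca_similarity(layer_a, layer_b)
sim_unrelated = mean_cca_similarity(layer_a, unrelated)
```

Because CCA is invariant to invertible linear transforms, `sim_related` should be close to 1, while `sim_unrelated` should be small. In a layer-wise analysis, the second argument would typically be another layer's features or an encoding of labels such as phones or words.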
Oct-31-2024