The Limits of Data Scaling: Sub-token Utilization and Acoustic Saturation in Multilingual ASR
