Analyzing the relationships between pretraining language, phonetic, tonal, and speaker information in self-supervised speech models
