Self-Supervised Speech Representations are More Phonetic than Semantic

Open in new window