Using AI to Translate Speech for a Primarily Oral Language
AI-powered speech translation has mainly focused on written languages, yet nearly 3,500 living languages are primarily spoken and have no widely used writing system. That makes it impossible to build machine translation tools for them with standard techniques, which require large amounts of written text to train an AI model. To address this challenge, we've built the first AI-powered speech-to-speech translation system for Hokkien, a primarily oral language that is widely spoken within the Chinese diaspora but lacks a standard written form. We're open-sourcing our Hokkien translation models, evaluation datasets, and research papers so that others can reproduce and build on our work. The translation system is part of our Universal Speech Translator project, which is developing new AI methods that we hope will eventually enable real-time speech-to-speech translation across many languages.
Oct-21-2022, 11:50:17 GMT