Usage of speaker embeddings for more inclusive speech-to-text
English is one of the most widely used languages worldwide, with approximately 1.2 billion speakers. In order to maximise the performance of speech-to-text systems it is vital to build them in a way that recognises different accents. Recently, spoken dialogue systems have been incorporated into various devices such as smartphones, call services, and navigation systems. These intelligent agents can assist users in performing daily tasks such as booking tickets, setting-up calendar items, or finding restaurants via spoken interaction. They have the potential to be more widely used in a vast range of applications in the future, especially in the education, government, healthcare, and entertainment sectors.
Jun-26-2020, 11:17:52 GMT
- Technology: