Microsoft Translator publicly releases speech translation corpus
As part of an ongoing effort within Microsoft to improve the accuracy of artificial intelligence (AI) systems, Microsoft Translator is publicly releasing a data set of conversations between bilingual speakers in French, German and English. The corpus, which Microsoft produced, is intended as a standard against which people can measure how well their conversational speech translation systems work, and it can be used to test systems such as the Microsoft Translator live feature and Skype Translator. Christian Federmann, a senior program manager working with the Microsoft Translator team, said there aren't many standardized data sets for testing bilingual conversational speech translation systems. "You need high-quality data in order to have high-quality testing," Federmann said.
Feb-6-2017, 08:30:21 GMT