Cost-effective speech-to-text with weakly- and semi-supervised training

AIHub 

Voice assistants equipped with speech-to-text technology have seen a major boost in performance and usage, thanks to powerful new machine learning methods based on deep neural networks. These methods follow a supervised learning approach, so training the best-performing speech-to-text transcription models requires large amounts of paired speech-text data. Collecting a large, relevant, and diverse set of spoken utterances is only the first step: the complex and labour-intensive task of annotating and labelling the collected speech data still awaits. To get a feel for a typical scenario, let's look at some estimates. A typical user query, for example "Do you have the Christmas edition with Santa?", lasts about 3 seconds on average.
