Facebook releases low-latency online speech recognition framework

#artificialintelligence 

Facebook AI Research (FAIR) today said it is open-sourcing wav2letter@anywhere, a deep learning-based inference framework that delivers low-latency online automatic speech recognition in cloud or embedded edge environments. Wav2letter@anywhere is based on the neural net-based language models wav2letter and wav2letter++; when the latter was released in December 2018, FAIR called it the fastest open source speech recognition system available. Automatic speech recognition, or ASR, turns audio of spoken words into text, which can then be used to infer the speaker's intent and carry out a task. An API available on GitHub through the wav2letter repository is built to support concurrent audio streams and popular kinds of deep learning speech recognition models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), in order to deliver the scale necessary for online ASR. Wav2letter@anywhere achieves a better word error rate than two baseline models built from bidirectional LSTM RNNs, according to a paper released last week by eight FAIR researchers from labs in New York City and at company headquarters in Menlo Park.
