DeepSpeech for Dummies - A Tutorial and Overview

Nov-2-2022, 14:30:39 GMT–#artificialintelligence

DeepSpeech is a neural network architecture first published by a research team at Baidu. In 2017, Mozilla released an open source implementation of the paper, dubbed "Mozilla DeepSpeech". The original DeepSpeech paper from Baidu popularized the concept of "end-to-end" speech recognition models. "End-to-end" means that the model takes in audio and directly outputs characters or words. This contrasts with traditional speech recognition models, such as those built with the popular open source toolkits Kaldi or CMU Sphinx, which predict phonemes and then convert those phonemes to words in a separate downstream step. The goal of "end-to-end" models like DeepSpeech is to collapse the speech recognition pipeline into a single model. The Baidu paper also argued that large deep learning models trained on large amounts of data would outperform classical speech recognition systems.
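
To make "directly outputs characters" concrete, here is a minimal sketch of greedy CTC-style decoding, the kind of step that turns per-frame character scores into text. The Baidu paper trained its network with a CTC loss, but everything specific below is an assumption made for illustration: the four-symbol alphabet, the hand-written frame scores, and the function name greedy_ctc_decode are all hypothetical, and this is not Mozilla DeepSpeech's actual decoder.

```python
# Toy illustration only: the alphabet and the per-frame scores are invented.
ALPHABET = ["_", "c", "a", "t"]  # index 0 is the CTC "blank" symbol
BLANK = 0


def greedy_ctc_decode(frame_scores):
    """Pick the best symbol per frame, merge adjacent repeats, drop blanks."""
    # Index of the highest-scoring symbol for each audio frame.
    best = [max(range(len(row)), key=row.__getitem__) for row in frame_scores]
    chars = []
    prev = None
    for idx in best:
        # Emit a character only when it differs from the previous frame
        # and is not the blank symbol.
        if idx != prev and idx != BLANK:
            chars.append(ALPHABET[idx])
        prev = idx
    return "".join(chars)


# One row per audio frame, one column per symbol in ALPHABET.
frame_scores = [
    [0.10, 0.80, 0.05, 0.05],  # "c"
    [0.10, 0.70, 0.10, 0.10],  # "c" again (merged with the previous frame)
    [0.90, 0.03, 0.03, 0.04],  # blank
    [0.10, 0.10, 0.70, 0.10],  # "a"
    [0.20, 0.10, 0.10, 0.60],  # "t"
    [0.80, 0.10, 0.05, 0.05],  # blank
]

print(greedy_ctc_decode(frame_scores))  # prints "cat"
```

Note that no phoneme inventory or pronunciation dictionary appears anywhere in the sketch: the scores index characters directly, which is the simplification the "end-to-end" approach is after. A phoneme-based pipeline would need an extra lookup stage between the acoustic model and the final words.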
