New Datasets to Democratize Speech Recognition Technology

Jan-15-2022, 07:10:28 GMT–#artificialintelligence

The next wave of AI will be powered by the democratization of data. Open-source frameworks such as TensorFlow and PyTorch have brought machine learning to a huge developer base, but most state-of-the-art models still rely on training datasets that are either wholly proprietary or prohibitively expensive to license [1]. As a result, the best automated speech recognition (ASR) models for converting speech audio into text are available only commercially and are trained on data unavailable to the general public. Furthermore, market incentives mean that only widely spoken languages receive industry attention, which limits cutting-edge speech technology to English and a handful of other languages. One obstacle is prohibitive licensing: several free datasets do exist, but most of those with sufficient size and quality to make models truly shine are barred from commercial use. In response, we created The People's Speech, a massive English-language dataset of audio transcriptions of full sentences (see Sample 1).

News Web Page

Country:
- Africa > East Africa (0.04)
- Oceania > Australia
  - Queensland > Brisbane (0.04)
- North America
  - United States > Texas
    - Dallas County > Dallas (0.04)
  - Canada
    - Quebec > Montreal (0.04)
    - Alberta > Census Division No. 11
      - Edmonton Metropolitan Region > Edmonton (0.04)

Genre:
- Research Report > Promising Solution (0.34)

Industry:
- Information Technology (0.94)
- Health & Medicine (0.69)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (1.00)
  - Machine Learning > Neural Networks (0.88)
