12 Best Audio Datasets for Machine Learning | Lionbridge AI

#artificialintelligence 

At Lionbridge, we have deep experience helping the world's largest companies teach applications to understand audio. From virtual assistants to in-car navigation, all sound-activated machine learning systems rely on large sets of audio data. This time, we combed the web and compiled this cheat sheet of public audio datasets for machine learning.

AudioSet: AudioSet is an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos.

LibriSpeech: LibriSpeech is a carefully segmented and aligned corpus of approximately 1,000 hours of read English speech sampled at 16 kHz, derived from audiobooks.

Spoken Digit Dataset: This dataset was created for the task of identifying spoken digits in audio samples.
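To give a rough sense of scale for the AudioSet and LibriSpeech figures quoted above, the short sketch below converts them into total hours and total samples. It uses only the numbers stated in the descriptions. The helper names `total_hours` and `total_samples` are made up for illustration, and the results are upper-bound estimates that assume every AudioSet clip is a full 10 seconds long and that LibriSpeech holds exactly 1,000 hours of single-channel audio.

```python
# Back-of-the-envelope sizing from the figures quoted in the dataset descriptions.
# Helper names are illustrative only; they are not part of any dataset tooling.

def total_hours(num_clips: int, clip_seconds: float) -> float:
    """Total duration in hours, assuming every clip has the same length."""
    return num_clips * clip_seconds / 3600.0


def total_samples(hours: int, sample_rate_hz: int) -> int:
    """Number of audio samples in `hours` of single-channel audio."""
    return hours * 3600 * sample_rate_hz


# AudioSet: 2,084,320 clips, assumed to be 10 seconds each.
audioset_hours = total_hours(2_084_320, 10)

# LibriSpeech: about 1,000 hours at a 16 kHz sampling rate.
librispeech_samples = total_samples(1000, 16_000)

print(f"AudioSet: about {audioset_hours:,.0f} hours of audio")
print(f"LibriSpeech: about {librispeech_samples:,} samples")
```

Under these assumptions AudioSet comes to several thousand hours of audio, which is worth knowing before planning storage or download time for either corpus.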
