Self-supervised learning in Audio and Speech
The ongoing success of deep learning techniques depends on the quality of the representations automatically discovered from data [1]. These representations must capture important underlying structures in the raw input, e.g., intermediate concepts, features, or latent variables that are useful for the downstream task. While supervised learning on large annotated corpora can produce useful representations, collecting large amounts of annotated examples is costly, time-consuming, and not always feasible. This is a problem across a wide variety of applications. In the speech domain, for instance, there are many low-resource languages, where progress is dramatically slower than in high-resource languages such as English.
Jul-2-2020, 12:25:56 GMT