Multimodal and Multilingual Embeddings for Large-Scale Speech Mining
Neural Information Processing Systems
Using a similarity metric in that multimodal embedding space, we perform mining of audio in German, French, Spanish and English from Librivox against billions of sentences from CommonCrawl.
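
As a rough illustration of similarity-based mining in a shared embedding space, the sketch below pairs each audio embedding with its most similar text embedding and keeps the pair only if the score clears a threshold. The choice of cosine similarity, the threshold value, the function names (`cosine`, `mine_pairs`), and the toy vectors are all assumptions made for illustration; the summary above does not say which metric or decision rule the authors use.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mine_pairs(audio_embs, text_embs, threshold=0.8):
    """Return (audio_index, text_index, score) for each audio embedding whose
    best-matching text embedding scores at or above the threshold.

    Hypothetical helper: brute-force search, assumed cosine metric.
    """
    pairs = []
    for i, audio_vec in enumerate(audio_embs):
        best_j, best_score = -1, -1.0
        for j, text_vec in enumerate(text_embs):
            score = cosine(audio_vec, text_vec)
            if score > best_score:
                best_j, best_score = j, score
        if best_j >= 0 and best_score >= threshold:
            pairs.append((i, best_j, best_score))
    return pairs


# Toy 2-dimensional embeddings (made up; real embeddings are high-dimensional).
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text = [[0.9, 0.1], [-1.0, 0.0]]
pairs = mine_pairs(audio, text, threshold=0.8)
print(pairs)
```

Only the first audio vector finds a text match above the threshold here. The nested loop compares every audio vector with every text vector, which would not be practical against billions of sentences; at that scale an approximate nearest-neighbor index would be needed in place of the inner loop.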
Feb-9-2026, 15:35:56 GMT
- Genre:
  - Research Report > New Finding (0.46)
- Technology:
  - Information Technology > Artificial Intelligence
    - Machine Learning (1.00)
    - Natural Language > Machine Translation (0.48)
    - Speech > Speech Recognition (0.49)