Leveraging Synthetic Audio Data for End-to-End Low-Resource Speech Translation
arXiv.org Artificial Intelligence
This paper describes our system submission to the International Conference on Spoken Language Translation (IWSLT 2024) for Irish-to-English speech translation. We built end-to-end systems based on Whisper, and employed a number of data augmentation techniques, such as speech back-translation and noise augmentation. We investigate the effect of using synthetic audio data and discuss several methods for enriching signal diversity.
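The noise augmentation mentioned above can be illustrated with a minimal sketch. The abstract does not say what kind of noise or which signal-to-noise ratios the authors used, so this example assumes additive white Gaussian noise mixed in at a chosen SNR. The function name `add_noise_at_snr` and its parameters are hypothetical and are not taken from the paper.

```python
import math
import random


def add_noise_at_snr(samples, snr_db, seed=None):
    """Return a noisy copy of `samples` at roughly `snr_db` dB SNR.

    Assumption: white Gaussian noise. The paper may use a different
    noise source (for example, recorded background noise).
    """
    if not samples:
        raise ValueError("samples must not be empty")
    rng = random.Random(seed)
    # Average power of the clean signal.
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


# Usage: one second of a 440 Hz tone at 16 kHz, corrupted at 10 dB SNR.
sample_rate = 16000
clean = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate)
         for n in range(sample_rate)]
noisy = add_noise_at_snr(clean, snr_db=10.0, seed=0)
```

Drawing a different SNR for each training utterance is one simple way to vary the signal, which matters when much of the audio is synthetic and otherwise acoustically uniform.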
Jun-27-2024
- Country:
  - North America
    - United States
      - Massachusetts (0.04)
      - Pennsylvania > Philadelphia County
        - Philadelphia (0.04)
    - Mexico > Mexico City
      - Mexico City (0.04)
    - Canada > Ontario
      - Toronto (0.04)
  - Europe
    - Switzerland (0.04)
    - Germany > Berlin (0.04)
    - Spain > Valencian Community
      - Valencia Province > Valencia (0.04)
    - Middle East > Republic of Türkiye
      - Istanbul Province > Istanbul (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.04)
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
  - Asia
    - Taiwan > Taiwan Province
      - Taipei (0.04)
    - Middle East > Republic of Türkiye
      - Istanbul Province > Istanbul (0.04)
    - India > Bihar
      - Patna (0.04)
- Genre:
  - Research Report (0.64)
- Technology: