Leveraging Synthetic Audio Data for End-to-End Low-Resource Speech Translation

Jun-27-2024–arXiv.org Artificial Intelligence

This paper describes our system submission to the International Conference on Spoken Language Translation (IWSLT 2024) for Irish-to-English speech translation. We built end-to-end systems based on Whisper, and employed a number of data augmentation techniques, such as speech back-translation and noise augmentation. We investigate the effect of using synthetic audio data and discuss several methods for enriching signal diversity.

dataset, proceedings, translation, (12 more...)

arXiv.org Artificial Intelligence

Jun-27-2024

arXiv.org PDF

Add feedback

Country:
- North America
  - United States
    - Massachusetts (0.04)
    - Pennsylvania > Philadelphia County
      - Philadelphia (0.04)
  - Mexico > Mexico City
    - Mexico City (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Switzerland (0.04)
  - Germany > Berlin (0.04)
  - Spain > Valencian Community
    - Valencia Province > Valencia (0.04)
  - Middle East > Republic of Türkiye
    - Istanbul Province > Istanbul (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - France > Provence-Alpes-Côte d'Azur
    - Bouches-du-Rhône > Marseille (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia
  - Taiwan > Taiwan Province
    - Taipei (0.04)
  - Middle East > Republic of Türkiye
    - Istanbul Province > Istanbul (0.04)
  - India > Bihar
    - Patna (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Speech (1.00)
  - Natural Language > Machine Translation (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found