Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23

Tsiamas, Ioannis; Gállego, Gerard I.; Fonollosa, José A. R.; Costa-jussà, Marta R.

Jun-2-2023–arXiv.org Artificial Intelligence

In the past decade, the field of Speech Translation (ST) has seen significant advancements, mainly due to end-to-end models that directly translate speech, offering a more efficient method compared [...] Despite data availability challenges, recent progress has diminished the performance disparity between these approaches (Bentivogli et al., 2021; Potapczyk and Przybysz, 2020; Inaguma et al., [...]

[...] Gállego et al. (2021); Zhao et al. (2022) aimed to [...] Han et al. (2021) tackled the issue by projecting speech and text features [...] In our work, we tackle the issue of misaligned speech and text encoder representations by adopting the approach proposed by Le et al. (2023). [...] on English ASR, wav2vec 2.0 (Baevski et al., 2020), and an MT foundation model fine-tuned on multilingual MT (En-Xx), mBART50 (Tang et al., 2020), as described in Section 2.1.

artificial intelligence, machine learning, natural language, (19 more...)


Country:
- South America > Chile
  - Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America
  - United States
    - Pennsylvania (0.04)
    - New York > New York County
      - New York City (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.04)
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
  - Canada > Alberta
    - Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
- Europe
  - Spain (0.04)
  - Belgium (0.04)
  - Romania > Sud - Muntenia Development Region
    - Giurgiu County > Giurgiu (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)
- Asia
  - China > Hong Kong (0.04)
  - Vietnam > Thái Bình Province
    - Thái Bình (0.04)
  - Thailand
    - Bangkok > Bangkok (0.04)
    - Phuket > Phuket (0.04)

Genre:
- Research Report
  - Experimental Study (0.46)
  - New Finding (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (1.00)
  - Natural Language > Machine Translation (1.00)
  - Machine Learning (1.00)
