Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation

Huang, Wuwei, Jin, Renren, Zhang, Wen, Luan, Jian, Wang, Bin, Xiong, Deyi

Mar-14-2025–arXiv.org Artificial Intelligence

Recent studies on end-to-end speech translation (ST) have facilitated the exploration of multilingual end-to-end ST and end-to-end simultaneous ST. In this paper, we investigate end-to-end simultaneous speech translation in a one-to-many multilingual setting, which is closer to real-world applications. We explore a separate-decoder architecture and a unified architecture for joint synchronous training in this scenario. To further explore knowledge transfer across languages, we propose an asynchronous training strategy for the unified decoder architecture. We curate a multi-way aligned multilingual end-to-end ST dataset as a benchmark testbed for evaluating our methods. Experimental results demonstrate the effectiveness of our models on the collected dataset. Our code and data are available at: https://github.com/XiaoMi/TED-MMST.

speech translation, target language, translation

Country:
- Asia
  - Singapore (0.05)
  - China
    - Tianjin Province > Tianjin (0.05)
    - Yunnan Province (0.04)
    - Beijing > Beijing (0.04)

Genre:
- Research Report > New Finding (0.34)

Industry:
- Education (0.48)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (1.00)
  - Natural Language > Machine Translation (1.00)
