A Modular-based Strategy for Mitigating Gradient Conflicts in Simultaneous Speech Translation

Liu, Xiaoqian, Du, Yangfan, Wang, Jianjin, Ge, Yuan, Xu, Chen, Xiao, Tong, Chen, Guocheng, Zhu, Jingbo

Dec-30-2024–arXiv.org Artificial Intelligence

Simultaneous Speech Translation (SimulST) generates target-language text while continuously processing streaming speech input, which poses significant real-time challenges. Multi-task learning is often employed to enhance SimulST performance, but it introduces optimization conflicts between the primary and auxiliary tasks, potentially compromising overall efficiency. Existing model-level conflict resolution methods are not well suited to this task: they exacerbate inefficiencies and lead to high GPU memory consumption. To address these challenges, we propose a Modular Gradient Conflict Mitigation (MGCM) strategy that detects conflicts at a finer-grained modular level and resolves them using gradient projection. Experimental results demonstrate that MGCM significantly improves SimulST performance, particularly under medium and high latency conditions, and achieves a 0.68 BLEU score gain in offline tasks. Additionally, MGCM reduces GPU memory consumption by over 95% compared to other conflict mitigation methods, establishing it as a robust solution for SimulST tasks.
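
The abstract does not spell out the projection rule, so the following is only a minimal sketch of what per-module conflict detection and gradient projection might look like. It assumes a PCGrad-style rule: within each module, a conflict is a negative inner product between the primary-task and auxiliary-task gradients, and the auxiliary gradient is projected onto the normal plane of the primary gradient before the two are summed. The function names, the module names ("encoder", "decoder"), the choice to project only the auxiliary gradient, and the use of flat Python lists for gradients are all illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_if_conflicting(primary: Vector, auxiliary: Vector) -> Vector:
    """Return the auxiliary gradient, projected only if it conflicts.

    Assumption: a conflict is a negative inner product with the primary
    gradient. In that case the component of the auxiliary gradient along
    the primary gradient is removed (PCGrad-style projection).
    """
    inner = dot(auxiliary, primary)
    norm_sq = dot(primary, primary)
    if inner >= 0.0 or norm_sq == 0.0:
        return list(auxiliary)  # no conflict: leave the gradient unchanged
    scale = inner / norm_sq
    return [a - scale * p for a, p in zip(auxiliary, primary)]


def mitigate_per_module(
    primary_grads: Dict[str, Vector],
    auxiliary_grads: Dict[str, Vector],
) -> Dict[str, Vector]:
    """Detect and resolve conflicts module by module, then sum the gradients.

    Each dictionary key is a hypothetical module name; conflicts are checked
    independently per module rather than on one model-wide gradient vector.
    """
    combined: Dict[str, Vector] = {}
    for name, g_primary in primary_grads.items():
        g_aux = project_if_conflicting(g_primary, auxiliary_grads[name])
        combined[name] = [p + a for p, a in zip(g_primary, g_aux)]
    return combined


if __name__ == "__main__":
    primary = {"encoder": [1.0, 0.0], "decoder": [0.0, 1.0]}
    auxiliary = {"encoder": [-1.0, 1.0], "decoder": [1.0, 1.0]}
    # Only the encoder gradients conflict (inner product is -1), so only
    # the encoder's auxiliary gradient is projected.
    print(mitigate_per_module(primary, auxiliary))
```

In this toy input only the encoder gradients conflict, so the decoder update is the plain sum of both task gradients. Working on one module at a time also means the comparison never needs a single flattened copy of the whole model's gradients, which is one plausible reading of where the reported memory saving comes from.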

artificial intelligence, mitigating gradient conflict, simultaneous speech translation

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Machine Translation (0.60)
  - Speech > Speech Recognition (0.60)