MLLP-VRAIN UPV system for the IWSLT 2025 Simultaneous Speech Translation Translation task

Iranzo-Sánchez, Jorge, Iranzo-Sánchez, Javier, Giménez, Adrià, Civera, Jorge, Juan, Alfons

Jun-24-2025–arXiv.org Artificial Intelligence

This work describes the participation of the MLLP-VRAIN research group in the shared task of the IWSLT 2025 Simultaneous Speech Translation track. Our submission addresses the unique challenges of real-time translation of long-form speech by developing a modular cascade system that adapts strong pre-trained models to streaming scenarios. We combine Whisper Large-V3-Turbo for ASR with the multilingual NLLB-3.3B model for MT, implementing lightweight adaptation techniques rather than training new end-to-end models from scratch. Our approach employs document-level adaptation with prefix training to enhance the MT model's ability to handle incomplete inputs, while incorporating adaptive emission policies including a wait-$k$ strategy and RALCP for managing the translation stream. Specialized buffer management techniques and segmentation strategies ensure coherent translations across long audio sequences. Experimental results on the ACL60/60 dataset demonstrate that our system achieves a favorable balance between translation quality and latency, with a BLEU score of 31.96 and non-computational-aware StreamLAAL latency of 2.94 seconds. Our final model achieves a preliminary score on the official test set (IWSLT25Instruct) of 29.8 BLEU. Our work demonstrates that carefully adapted pre-trained components can create effective simultaneous translation systems for long-form content without requiring extensive in-domain parallel data or specialized end-to-end training.

artificial intelligence, computational linguistic, natural language, (16 more...)

arXiv.org Artificial Intelligence

Jun-24-2025

arXiv.org PDF

Add feedback

Country:
- Europe (1.00)
- Asia (1.00)
- North America > United States
  - Pennsylvania (0.28)

Genre:
- Research Report (0.50)

Industry:
- Government (0.46)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)