Segmentation-Free Streaming Machine Translation
Iranzo-Sánchez, Javier; Iranzo-Sánchez, Jorge; Giménez, Adrià; Civera, Jorge; Juan, Alfons
arXiv.org Artificial Intelligence
Streaming Machine Translation (MT) is the task of translating an unbounded input text stream in real time. The traditional cascade approach, which combines an Automatic Speech Recognition (ASR) system with an MT system, relies on an intermediate segmentation step that splits the transcription stream into sentence-like units. However, imposing a hard segmentation constrains the MT system and is a source of errors. This paper proposes a Segmentation-Free framework that enables the model to translate an unsegmented source stream by delaying the segmentation decision until the translation has been generated. Extensive experiments show that the proposed Segmentation-Free framework achieves a better quality-latency trade-off than competing approaches that use an independent segmentation model. Software, data and models will be released upon paper acceptance.
Sep-26-2023