Segmentation-Free Streaming Machine Translation

Iranzo-Sánchez, Javier; Iranzo-Sánchez, Jorge; Giménez, Adrià; Civera, Jorge; Juan, Alfons

Sep-26-2023–arXiv.org Artificial Intelligence

Streaming Machine Translation (MT) is the task of translating an unbounded input text stream in real time. The traditional cascade approach, which combines an Automatic Speech Recognition (ASR) system with an MT system, relies on an intermediate segmentation step that splits the transcription stream into sentence-like units. However, imposing a hard segmentation constrains the MT system and is a source of errors. This paper proposes a Segmentation-Free framework that enables the model to translate an unsegmented source stream by delaying the segmentation decision until the translation has been generated. Extensive experiments show that the proposed Segmentation-Free framework achieves a better quality-latency trade-off than competing approaches that use an independent segmentation model. Software, data and models will be released upon paper acceptance.

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
