Improving Isochronous Machine Translation with Target Factors and Auxiliary Counters

Proyag Pal, Brian Thompson, Yogesh Virkar, Prashant Mathur, Alexandra Chronopoulou, Marcello Federico

May 22, 2023, arXiv.org Artificial Intelligence

To translate speech for automatic dubbing, machine translation needs to be isochronous, i.e., the translated speech needs to match the source in terms of speech durations. We introduce target factors in a transformer model to predict durations jointly with target-language phoneme sequences. We also introduce auxiliary counters to help the decoder keep track of timing information while generating target phonemes. We show that our model improves translation quality and isochrony compared to previous work, in which the translation model is instead trained to predict interleaved sequences of phonemes and durations.
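The difference between the two output formats mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the toy phonemes and durations, the integer duration units, and the particular "remaining time" counter are hypothetical and are not taken from the paper, which may define its factors and counters differently.

```python
from typing import List, Tuple


def to_interleaved(phonemes: List[str], durations: List[int]) -> List[str]:
    """Interleaved format: one flat sequence alternating phoneme and duration tokens."""
    if len(phonemes) != len(durations):
        raise ValueError("each phoneme needs exactly one duration")
    tokens: List[str] = []
    for phoneme, duration in zip(phonemes, durations):
        tokens.extend([phoneme, str(duration)])
    return tokens


def to_factored(phonemes: List[str], durations: List[int]) -> Tuple[List[str], List[int]]:
    """Factored format: two parallel streams, one duration per phoneme position."""
    if len(phonemes) != len(durations):
        raise ValueError("each phoneme needs exactly one duration")
    return list(phonemes), list(durations)


def remaining_counter(durations: List[int], total: int) -> List[int]:
    """Hypothetical auxiliary counter: time budget left before each phoneme is emitted."""
    remaining = total
    counter: List[int] = []
    for duration in durations:
        counter.append(remaining)
        remaining -= duration
    return counter


# Toy example (assumed values, arbitrary duration units).
phonemes = ["h", "ə", "l", "oʊ"]
durations = [3, 5, 4, 8]

interleaved = to_interleaved(phonemes, durations)
main_stream, factor_stream = to_factored(phonemes, durations)
counter = remaining_counter(durations, total=sum(durations))

print(interleaved)    # flat sequence, twice as long as the phoneme sequence
print(main_stream)    # phoneme stream
print(factor_stream)  # duration stream, aligned position by position
print(counter)        # budget remaining at each decoding step
```

In this sketch the factored representation keeps the output sequence the same length as the phoneme sequence, whereas the interleaved one doubles it; the counter gives the decoder an explicit running view of how much time is left.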

artificial intelligence, natural language, target factor
