Ordering Matters: Word Ordering Aware Unsupervised NMT

Banerjee, Tamali, Murthy, Rudra V, Bhattacharyya, Pushpak

Oct-30-2019–arXiv.org Machine Learning

Specifically, given an input sentence of length n, the model applies n/2 random swaps between consecutive words and trains the denoising-based U-NMT model (Artetxe, Labaka, and Agirre 2018). Though effective, applying the denoising strategy to every sentence in the training data introduces uncertainty into the model, thereby limiting the benefits of the denoising-based U-NMT model. In this paper, we propose a simple fine-tuning strategy in which we fine-tune the trained denoising-based U-NMT system without the denoising strategy. The input sentences are presented as is, i.e., without any shuffling noise added. We observe significant improvements in translation performance on many language pairs from our fine-tuning strategy. Our analysis reveals that our proposed models lead to an increase in higher-order n-gram BLEU scores compared to the denoising U-NMT models.

1 Introduction

Unsupervised Neural Machine Translation (U-NMT) systems (Lample et al. 2018; Artetxe, Labaka, and Agirre 2018; 2019; Wu, Wang, and Wang 2019) typically train an encoder-decoder model for the machine translation task using the monolingual data available in the two languages (l1, l2). The model proposed by Artetxe, Labaka, and Agirre (2018) consists of a shared encoder and language-specific decoders.
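The shuffling noise and the noise-free fine-tuning input described in the abstract can be illustrated with a minimal Python sketch. The function names `swap_noise` and `make_training_input` are hypothetical, and two details are assumptions rather than statements from the paper: n/2 is rounded down, and each swap position is drawn uniformly at random, so the same position may be picked more than once.

```python
import random


def swap_noise(tokens, rng=None):
    """Apply n // 2 random swaps between consecutive tokens (n = sentence length).

    Assumption: swap positions are sampled uniformly and independently.
    """
    rng = rng or random.Random()
    noisy = list(tokens)  # copy so the caller's list is left untouched
    n = len(noisy)
    if n < 2:
        return noisy  # nothing to swap in empty or one-word sentences
    for _ in range(n // 2):
        i = rng.randrange(n - 1)  # index of the left token of the pair
        noisy[i], noisy[i + 1] = noisy[i + 1], noisy[i]
    return noisy


def make_training_input(tokens, fine_tuning, rng=None):
    """Return the encoder input for one sentence.

    During denoising training the sentence is shuffled; during the
    fine-tuning phase it is passed through unchanged.
    """
    return list(tokens) if fine_tuning else swap_noise(tokens, rng)


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    print(make_training_input(sentence, fine_tuning=False, rng=random.Random(0)))
    print(make_training_input(sentence, fine_tuning=True))
```

In both phases the training target would remain the original, unshuffled sentence; only the input side changes between denoising training and fine-tuning.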

machine translation, source sentence, translation

Country:
- Asia > India (0.04)
- North America > United States
  - Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Belgium
  - Brussels-Capital Region > Brussels (0.04)

Genre:
- Research Report (0.83)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Machine Translation (1.00)
  - Machine Learning (1.00)
