Semi-Autoregressive Training Improves Mask-Predict Decoding

Ghazvininejad, Marjan, Levy, Omer, Zettlemoyer, Luke

Jan-23-2020–arXiv.org Machine Learning

The recently proposed mask-predict decoding algorithm has narrowed the performance gap between semi-autoregressive machine translation models and the traditional left-to-right approach. We introduce a new training method for conditional masked language models, SMART, which mimics the semi-autoregressive behavior of mask-predict, producing training examples that contain model predictions as part of their inputs. Models trained with SMART produce higher-quality translations when using mask-predict decoding, effectively closing the remaining performance gap with fully autoregressive models.

iteration, training example, translation, (16 more...)

arXiv.org Machine Learning

Jan-23-2020

arXiv.org PDF

Add feedback

Genre:
- Research Report (0.40)

Industry:
- Materials > Metals & Mining > Gold (0.30)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Inductive Learning (0.64)
  - Natural Language > Machine Translation (0.53)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found