Helping the Weak Makes You Strong: Simple Multi-Task Learning Improves Non-Autoregressive Translators

Wang, Xinyou, Zheng, Zaixiang, Huang, Shujian

Nov-11-2022–arXiv.org Artificial Intelligence

Recently, non-autoregressive (NAR) neural machine translation models have received increasing attention due to their efficient parallel decoding. However, the probabilistic framework of NAR models necessitates a conditional independence assumption on target sequences, which falls short of characterizing human language data. This drawback results in less informative learning signals for NAR models under conventional MLE training, thereby yielding unsatisfactory accuracy compared to their autoregressive (AR) counterparts. In this paper, we propose a simple and model-agnostic multi-task learning framework to provide more informative learning signals. During the training stage, we introduce a set of sufficiently weak AR decoders that rely solely on the information provided by the NAR decoder to make predictions, which forces the NAR decoder to become stronger; otherwise it cannot support its weak AR partners. Experiments on WMT and IWSLT datasets show that our approach consistently improves the accuracy of multiple NAR baselines without adding any decoding overhead.
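The training setup described in the abstract can be illustrated with a minimal sketch of a combined objective. This is not the authors' implementation: the function names, the averaging over weak AR decoders, and the `ar_weight` factor are all assumptions made for illustration, and the per-position probability lists stand in for real model outputs.

```python
import math

def cross_entropy(probs, targets):
    """Mean negative log-likelihood of the target token ids.

    probs: one probability distribution (list of floats) per target position.
    targets: one token id per target position.
    """
    return -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)

def multitask_loss(nar_probs, weak_ar_probs_list, targets, ar_weight=1.0):
    """Hypothetical multi-task objective: NAR loss plus weighted weak-AR loss.

    nar_probs: per-position distributions from the NAR decoder.
    weak_ar_probs_list: one list of per-position distributions for each weak
        AR decoder; each is assumed to have been computed from the NAR
        decoder's outputs, so its loss also trains the NAR decoder.
    ar_weight: assumed scalar weight on the auxiliary term (not from the source).
    """
    nar_loss = cross_entropy(nar_probs, targets)
    ar_losses = [cross_entropy(p, targets) for p in weak_ar_probs_list]
    # Average over auxiliary decoders; an empty list means NAR-only training.
    ar_loss = sum(ar_losses) / len(ar_losses) if ar_losses else 0.0
    return nar_loss + ar_weight * ar_loss
```

Under these assumptions the auxiliary term exists only during training. At inference the weak AR decoders are discarded and only the NAR decoder runs, which is consistent with the abstract's claim of no added decoding overhead.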

decoder, machine learning, natural language

Country:
- North America
  - United States
    - Pennsylvania (0.04)
    - New York > New York County
      - New York City (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Hawaii > Honolulu County
      - Honolulu (0.04)
    - California
      - San Diego County > San Diego (0.04)
      - Los Angeles County > Long Beach (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe
  - Sweden > Stockholm
    - Stockholm (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia > China
  - Hong Kong (0.04)
  - Jiangsu Province > Nanjing (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Machine Translation (1.00)
  - Machine Learning (1.00)
