Fully Quantized Transformer for Improved Translation

Prato, Gabriele, Charlaix, Ella, Rezagholizadeh, Mehdi

Oct-16-2019–arXiv.org Machine Learning

Abstract

State-of-the-art neural machine translation methods employ massive numbers of parameters. Drastically reducing the computational cost of such methods without affecting performance has so far remained unsolved. In this work, we propose a quantization strategy tailored to the Transformer (Vaswani et al., 2017) architecture. We evaluate our method on the WMT14 EN-FR and WMT14 EN-DE translation tasks and achieve state-of-the-art quantization results for the Transformer, obtaining no loss in BLEU scores compared to the non-quantized baseline. We further compress the Transformer by showing that, once the model is trained, a good portion of the nodes in the encoder can be removed without causing any loss in BLEU.

1 Introduction

Neural machine translation methods have achieved impressive results lately (Ahmed et al., 2017; Ott et al., 2018; Edunov et al., 2018). Although the approach was proposed only recently (Kalchbrenner & Blunsom, 2013; Sutskever et al., 2014; Cho et al., 2014), many strong contributions have moved the field forward quickly. Bahdanau et al. (2014) introduced an attention mechanism, allowing the decoder to attend to any hidden state generated by the encoder. Multiple improvements to their approach have been proposed, such as multiplicative attention (Luong et al., 2015) and, more recently, multi-head self-attention (Vaswani et al., 2017).
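For readers unfamiliar with the term, the following is a minimal sketch of generic uniform quantization, the broad family of techniques the abstract refers to. It is not the paper's specific strategy, which this excerpt does not describe. The function name `fake_quantize`, the 8-bit width, and the min-max range choice are all assumptions made for illustration.

```python
def fake_quantize(values, num_bits=8):
    """Map floats to integer codes on a uniform grid, then map them back.

    Illustrative only: the bit width and the min-max range are assumptions,
    not details taken from the paper.
    """
    qmax = 2 ** num_bits - 1              # largest integer code, e.g. 255
    lo, hi = min(values), max(values)
    if hi == lo:                          # constant input: one code suffices
        return [0] * len(values), list(values)
    scale = (hi - lo) / qmax              # width of one quantization step
    # Round each value to the nearest grid point and clamp to the code range.
    codes = [min(qmax, max(0, round((v - lo) / scale))) for v in values]
    # Dequantize to see what the quantized model would actually compute with.
    restored = [c * scale + lo for c in codes]
    return codes, restored


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]    # hypothetical weight values
codes, restored = fake_quantize(weights)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each value is stored as a small integer code, and the rounding error per value is bounded by half a quantization step, which is why low-bit storage can leave model outputs nearly unchanged.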

arxiv e-print, quantization, transformer

Country:
- North America
  - Canada > Quebec (0.04)
  - United States
    - Washington > King County
      - Seattle (0.04)
    - New Jersey > Middlesex County
      - Piscataway (0.04)
    - California > Santa Clara County
      - Stanford (0.04)
      - Palo Alto (0.04)
- Asia > Middle East
  - Jordan (0.04)
  - Qatar > Ad-Dawhah
    - Doha (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Machine Translation (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
