An Efficient Transformer Decoder with Compressed Sub-layers

Li, Yanyang; Lin, Ye; Xiao, Tong; Zhu, Jingbo

May-11-2023, arXiv.org Artificial Intelligence

The large attention-based encoder-decoder network (Transformer) has recently become prevalent due to its effectiveness, but the high computational complexity of its decoder makes it inefficient. By examining the mathematical formulation of the decoder, we show that under some mild conditions the architecture can be simplified by compressing its sub-layers, the basic building blocks of the Transformer, to achieve higher parallelism. We therefore propose the Compressed Attention Network, whose decoder layer consists of only one sub-layer instead of three. Extensive experiments on 14 WMT machine translation tasks show that our model is 1.42x faster than a strong baseline, with performance on par. This strong baseline is itself already 2x faster than the widely used standard baseline, with no loss in performance.
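For readers unfamiliar with the "three sub-layers" the abstract refers to, the following is a minimal NumPy sketch of a standard Transformer decoder layer: masked self-attention, encoder-decoder (cross) attention, and a position-wise feed-forward network, each wrapped in a residual connection and layer normalization. It is not the Compressed Attention Network, whose formulation is not given on this page. The function names, the single attention head, the omission of biases and learned normalization parameters, and the toy dimensions are all simplifying assumptions made for illustration.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize each position over the feature axis (no learned scale/shift).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attention(q_in, kv_in, w, causal=False):
    # Single-head scaled dot-product attention (illustrative simplification).
    q, k, v = q_in @ w["q"], kv_in @ w["k"], kv_in @ w["v"]
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if causal:
        # Block each target position from attending to later positions.
        future = np.triu(np.ones(scores.shape, dtype=bool), k=1)
        scores = np.where(future, -1e9, scores)
    return softmax(scores) @ v @ w["o"]

def ffn(x, w):
    # Position-wise feed-forward network with a ReLU in between.
    return np.maximum(x @ w["w1"], 0.0) @ w["w2"]

def decoder_layer(y, enc_out, params):
    # Sub-layer 1: masked self-attention over the target prefix.
    y = layer_norm(y + attention(y, y, params["self"], causal=True))
    # Sub-layer 2: cross-attention over the encoder output.
    y = layer_norm(y + attention(y, enc_out, params["cross"]))
    # Sub-layer 3: feed-forward network.
    y = layer_norm(y + ffn(y, params["ffn"]))
    return y

def init_params(d_model=8, d_ff=16, seed=0):
    # Hypothetical random weights, only so the sketch can be executed.
    rng = np.random.default_rng(seed)

    def attn_weights():
        return {name: rng.normal(scale=0.1, size=(d_model, d_model))
                for name in ("q", "k", "v", "o")}

    return {
        "self": attn_weights(),
        "cross": attn_weights(),
        "ffn": {"w1": rng.normal(scale=0.1, size=(d_model, d_ff)),
                "w2": rng.normal(scale=0.1, size=(d_ff, d_model))},
    }

if __name__ == "__main__":
    params = init_params()
    rng = np.random.default_rng(1)
    target = rng.normal(size=(5, 8))   # 5 target positions
    source = rng.normal(size=(7, 8))   # 7 encoder positions
    print(decoder_layer(target, source, params).shape)
```

In this sketch each sub-layer consumes the output of the previous one, so the three must run one after another within a layer. That sequential dependency is presumably what the abstract means when it says compressing the sub-layers yields higher parallelism.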

artificial intelligence, machine learning, natural language

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks (1.00)
  - Natural Language > Machine Translation (0.92)
