Improving Autoregressive NLP Tasks via Modular Linearized Attention

Jun-24-2023–arXiv.org Artificial Intelligence

Many natural language processing (NLP) tasks require small, efficient models because they are ultimately deployed at the edge or in other resource-constrained environments. While prior research has reduced the size of these models, increasing computational efficiency without a considerable loss in performance remains difficult, especially for autoregressive tasks. This paper proposes modular linearized attention (MLA), which combines multiple efficient attention mechanisms, including cosFormer [36], to maximize inference quality while achieving notable speedups.
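For readers unfamiliar with the term, the general idea behind linearized attention in an autoregressive setting can be sketched as below. This is a minimal illustration and not the MLA method or cosFormer itself: the feature map (elu(x) + 1) and all function names are assumptions chosen for the example, and it is written in dependency-free Python for clarity rather than speed. It shows that a running state lets each decoding step avoid revisiting all previous tokens, and includes a quadratic reference that computes the same quantity directly.

```python
import math


def feature_map(x):
    """elu(x) + 1: a positive feature map (an assumption for this sketch)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def causal_linear_attention(queries, keys, values):
    """Recurrent form: the state size does not grow with sequence length."""
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k_j) v_j^T
    norm = [0.0] * d_k                         # running sum of phi(k_j)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        phi_q, phi_k = feature_map(q), feature_map(k)
        # Fold the current token into the running state before reading it,
        # so each position attends to itself and to earlier positions only.
        for i in range(d_k):
            norm[i] += phi_k[i]
            for j in range(d_v):
                state[i][j] += phi_k[i] * v[j]
        denom = dot(phi_q, norm)
        outputs.append([
            sum(phi_q[i] * state[i][j] for i in range(d_k)) / denom
            for j in range(d_v)
        ])
    return outputs


def causal_quadratic_reference(queries, keys, values):
    """Same quantity computed by explicitly weighting every past token."""
    d_v = len(values[0])
    outputs = []
    for t, q in enumerate(queries):
        phi_q = feature_map(q)
        weights = [dot(phi_q, feature_map(keys[j])) for j in range(t + 1)]
        total = sum(weights)
        outputs.append([
            sum(w * values[j][c] for j, w in enumerate(weights)) / total
            for c in range(d_v)
        ])
    return outputs
```

The two functions agree up to floating-point error; the recurrent version does a fixed amount of work per generated token, which is the property that makes this family of mechanisms attractive for autoregressive inference.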

artificial intelligence, machine learning, natural language

Country:
- Asia (0.04)
- North America
  - United States > Oregon (0.04)
  - Canada > Quebec
    - Montreal (0.04)
- Europe
  - Belgium (0.04)
  - Italy > Tuscany
    - Florence (0.04)

Genre:
- Research Report (0.52)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Neural Networks (1.00)
  - Speech > Speech Recognition (0.68)
