EcoTransformer: Attention without Multiplication

Gao, Xin; Xu, Xingming; Amiraslani, Shirin; Xu, Hong

Aug-7-2025, arXiv.org Artificial Intelligence

The Transformer, with its scaled dot-product attention mechanism, has become a foundational architecture in modern AI. However, this mechanism is computationally intensive and incurs substantial energy costs. We propose a new Transformer architecture, EcoTransformer, in which the output context vector is constructed as the convolution of the values with a Laplacian kernel, where distances are measured by the L1 metric between the queries and keys. Unlike dot-product-based attention, the new attention score calculation is free of matrix multiplication. EcoTransformer performs on par with, or even surpasses, scaled dot-product attention in NLP, bioinformatics, and vision tasks, while consuming significantly less energy. (This version (v2) supersedes v1 and reflects the intended release and licensing.)
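
The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `scale` temperature, and the softmax-style normalization of the Laplacian kernel weights are assumptions made here for illustration. The sketch shows only that the query-key scores can be computed from L1 distances (subtraction, absolute value, addition) rather than from dot products.

```python
import math


def l1_distance(q, k):
    """L1 (Manhattan) distance: subtraction, absolute value, and addition only."""
    return sum(abs(a - b) for a, b in zip(q, k))


def laplacian_attention(queries, keys, values, scale=1.0):
    """Return one context vector per query.

    Weights are a normalized Laplacian kernel exp(-||q - k||_1 / scale).
    `scale` is a hypothetical temperature, not a value taken from the paper.
    """
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Score calculation: negative scaled L1 distance, no q.k dot product.
        scores = [-l1_distance(q, k) / scale for k in keys]
        # Numerically stable normalization over the keys.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted average of the values (this step still multiplies
        # each weight by the value entries).
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


if __name__ == "__main__":
    queries = [[0.0, 0.0]]
    keys = [[0.0, 0.0], [10.0, 10.0]]
    values = [[1.0, 0.0], [0.0, 1.0]]
    # The query coincides with the first key, so the context vector
    # is dominated by the first value.
    print(laplacian_attention(queries, keys, values))
```

In this toy example, keys that are closer to the query in L1 distance receive exponentially larger weights, which is the role the dot product plays in standard attention.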

artificial intelligence, machine learning, natural language

Country:
- North America > United States > California (0.28)

Genre:
- Research Report (1.00)

Industry:
- Energy (0.94)
- Health & Medicine
  - Pharmaceuticals & Biotechnology (0.93)
  - Therapeutic Area > Oncology (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language (1.00)
  - Representation & Reasoning (0.93)
