EcoTransformer: Attention without Multiplication
Xin Gao, Xingming Xu, Shirin Amiraslani, Hong Xu
–arXiv.org Artificial Intelligence
The Transformer, with its scaled dot-product attention mechanism, has become a foundational architecture in modern AI. However, this mechanism is computationally intensive and incurs substantial energy costs. We propose a new Transformer architecture, EcoTransformer, in which the output context vector is constructed as a convolution of the values using a Laplacian kernel, with distances measured by the L1 metric between queries and keys. Unlike dot-product-based attention, the new attention score calculation is free of matrix multiplication. It performs on par with, or even surpasses, scaled dot-product attention on NLP, bioinformatics, and vision tasks, while consuming significantly less energy.
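A minimal NumPy sketch of the mechanism as described in the abstract: pairwise L1 distances between queries and keys feed a Laplacian kernel exp(-d/sigma), and the normalized kernel weights aggregate the values. The bandwidth `sigma` and the row normalization are assumptions here; the paper's exact scaling and any low-level tricks are not spelled out in this abstract.

```python
import numpy as np

def laplacian_attention(Q, K, V, sigma=1.0):
    """Sketch of L1/Laplacian-kernel attention (sigma is an assumed
    bandwidth hyperparameter, not specified in the abstract)."""
    # Pairwise L1 distances d[i, j] = sum_k |Q[i, k] - K[j, k]|.
    # This score computation uses only subtraction, abs, and addition:
    # no matrix multiplication, unlike Q @ K.T in standard attention.
    d = np.abs(Q[:, None, :] - K[None, :, :]).sum(axis=-1)
    w = np.exp(-d / sigma)                 # Laplacian kernel weights
    w = w / w.sum(axis=-1, keepdims=True)  # normalize each query's weights
    return w @ V                           # weighted sum (convolution) of values

# Toy usage with random queries, keys, and values.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = laplacian_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Note that row-normalizing exp(-d/sigma) is equivalent to a softmax over negative L1 distances, so relative to standard attention only the score computation changes; the final aggregation of values is still a weighted sum.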
Aug-7-2025