EIT: Enhanced Interactive Transformer

Tong Zheng, Bei Li, Huiwen Bao, Tong Xiao, Jingbo Zhu

Dec-20-2022–arXiv.org Artificial Intelligence

We propose a novel architecture, the Enhanced Interactive Transformer (EIT), to address head degradation in self-attention mechanisms. Our approach replaces the traditional multi-head self-attention mechanism with an Enhanced Multi-Head Attention (EMHA) mechanism, which relaxes the one-to-one mapping constraint between queries and keys and allows each query to attend to multiple keys. Furthermore, we introduce two interaction models, Inner-Subspace Interaction and Cross-Subspace Interaction, to fully exploit the many-to-many mapping capability of EMHA. Extensive experiments on a wide range of tasks (e.g., machine translation, abstractive summarization, grammar correction, language modelling, and automatic brain disease diagnosis) demonstrate its superiority with only a very modest increase in model size.
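The abstract gives no equations, so the following is only a minimal sketch of one plausible reading of the "many-to-many" mapping: standard multi-head attention pairs the queries of head i with the keys of head i alone, whereas the relaxed form lets every query head meet every key head. The function names (`standard_head_maps`, `many_to_many_maps`), the tensor shapes, and the use of scaled dot-product scores are assumptions made for illustration, not the authors' implementation. The Inner-Subspace and Cross-Subspace Interaction models are not sketched here because the abstract does not describe them.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def standard_head_maps(q, k):
    """One-to-one mapping: queries of head h only meet keys of head h.

    q, k: arrays of shape (heads, seq, d_head).
    Returns attention maps of shape (heads, seq, seq).
    """
    d_head = q.shape[-1]
    scores = np.einsum("hid,hjd->hij", q, k) / np.sqrt(d_head)
    return softmax(scores, axis=-1)


def many_to_many_maps(q, k):
    """Assumed relaxed mapping: every query head meets every key head.

    Returns attention maps of shape (heads, heads, seq, seq), where
    entry [a, b] pairs query head a with key head b.
    """
    d_head = q.shape[-1]
    scores = np.einsum("aid,bjd->abij", q, k) / np.sqrt(d_head)
    return softmax(scores, axis=-1)


# Hypothetical sizes chosen only for the demonstration.
rng = np.random.default_rng(0)
heads, seq, d_head = 4, 5, 8
q = rng.standard_normal((heads, seq, d_head))
k = rng.standard_normal((heads, seq, d_head))

one_to_one = standard_head_maps(q, k)   # heads maps
many = many_to_many_maps(q, k)          # heads * heads maps
```

Under this reading, the standard maps are the diagonal entries `many[h, h]` of the relaxed set, so the relaxation yields `heads * heads` attention maps instead of `heads` without adding any new query or key projections.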

artificial intelligence, machine learning, natural language

Country:
- Asia (0.67)

Genre:
- Research Report (0.64)

Industry:
- Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks (0.68)
  - Natural Language > Machine Translation (0.89)
