DCT-Former: Efficient Self-Attention with Discrete Cosine Transform

Carmelo Scribano, Giorgia Franchini, Marco Prato, Marko Bertogna

arXiv.org Artificial Intelligence 

Transformers are a family of recently introduced Deep Learning (DL) models that use dot-product attention to map a sequence of tokens of arbitrary length into a new set of tokens. Thanks to their outstanding performance on a variety of tasks, transformers are now ubiquitous in state-of-the-art techniques that benefit from modeling long-term interactions between the elements of a sequence. Another important advantage of transformers is their ability to process sequences of arbitrary length in a single forward pass, without the limitations of recurrent approaches; no other standard Machine Learning (ML) or DL method in the literature has shown comparable adaptability so far. In Natural Language Processing (NLP), transformers are pervasive across tasks such as machine translation [1-4], text classification, document retrieval, document summarization, and several others. More recently, researchers have started to exploit the self-attention mechanism for computer vision tasks [5-7], either standalone or applied downstream of a convolutional backbone, and even for multimodal problems in which language and visual inputs need to be correlated.
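As a rough illustration of how dot-product attention maps a sequence of any length to a new sequence of the same length in one pass, consider the minimal sketch below. It is not the method proposed in this paper. It assumes the common scaled variant (scores divided by the square root of the token dimension) and, for brevity, uses the tokens directly as queries, keys, and values, without the learned projections a real transformer layer would apply. The function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(queries, keys, values):
    """Scaled dot-product attention on lists of equal-width vectors.

    Assumption: scores are scaled by 1/sqrt(d), where d is the key width.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    width = len(values[0])
    output = []
    for q in queries:
        # One score per key: similarity between this query and every token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Each output token is a weighted average of the value vectors.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(width)]
        )
    return output


# Self-attention without learned projections (a simplification):
# the same tokens act as queries, keys, and values.
short_seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
long_seq = short_seq * 4  # same function, four times as many tokens

short_out = dot_product_attention(short_seq, short_seq, short_seq)
long_out = dot_product_attention(long_seq, long_seq, long_seq)
```

The same function handles both inputs with no change to its parameters. Note that every query is scored against every key, so the work grows quadratically with the number of tokens.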
