Applying Linearly Scalable Transformers to Model Longer Protein Sequences


In a bid to make transformer models more practical for real-world applications, researchers from Google, the University of Cambridge, DeepMind and the Alan Turing Institute have proposed a new transformer architecture called the "Performer", based on what they call fast attention via orthogonal random features (FAVOR).

Proposed in 2017 and believed to be particularly well suited for language understanding tasks, the Transformer is a neural network architecture built on a self-attention mechanism. To date, in addition to achieving SOTA performance in natural language processing and neural machine translation tasks, transformer models have also performed well across other machine learning (ML) tasks such as document generation and summarization, time series prediction, image generation, and analysis of biological sequences. Neural networks typically process language by generating fixed- or variable-length vector-space representations, aggregating information from surrounding words step by step. A Transformer, by contrast, performs only a small, constant number of steps: in each step, it applies a self-attention mechanism that can directly model relationships between all words in a sentence, regardless of their respective positions.
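The key cost the Performer targets is the attention matrix itself: exact softmax attention compares every token with every other token, which scales quadratically with sequence length, while FAVOR approximates the same computation with random feature maps so that it scales linearly. The NumPy sketch below illustrates that random-feature idea under simplifying assumptions: it uses i.i.d. Gaussian positive random features rather than the orthogonal construction described in the paper, and the function names and the choice of 256 features are illustrative, not the authors' implementation.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Exact softmax attention: builds an (L, L) matrix, so O(L^2) in sequence length L."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                      # (L, L)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def random_feature_map(X, W):
    """Positive random features phi(x) = exp(Wx - ||x||^2 / 2) / sqrt(m),
    whose inner products approximate the softmax kernel exp(x^T y) in expectation."""
    m = W.shape[0]
    proj = X @ W.T                                     # (L, m)
    sq_norm = 0.5 * np.sum(X ** 2, axis=-1, keepdims=True)
    return np.exp(proj - sq_norm) / np.sqrt(m)

def linear_attention(Q, K, V, num_features=256, seed=0):
    """FAVOR-style linear attention sketch: never forms the (L, L) matrix,
    so the cost is O(L * m * d) rather than O(L^2 * d)."""
    d = Q.shape[-1]
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((num_features, d))         # i.i.d. features (paper uses orthogonal ones)
    # Scaling Q and K by d^{-1/4} reproduces the 1/sqrt(d) factor of scaled dot-product attention.
    Qp = random_feature_map(Q / d ** 0.25, W)          # (L, m)
    Kp = random_feature_map(K / d ** 0.25, W)          # (L, m)
    KV = Kp.T @ V                                      # (m, d), linear in L
    normaliser = Qp @ Kp.sum(axis=0)                   # (L,), approximates the softmax denominator
    return (Qp @ KV) / normaliser[:, None]

# Compare the approximation against exact attention on random inputs.
L, d = 512, 64
rng = np.random.default_rng(1)
Q, K, V = (rng.standard_normal((L, d)) * 0.1 for _ in range(3))
print("max abs error:", np.abs(softmax_attention(Q, K, V) - linear_attention(Q, K, V)).max())
```

Because the sequence length L only ever appears in matrix dimensions that are multiplied by the fixed feature count m, the memory and compute grow linearly with L, which is what makes this family of attention mechanisms attractive for very long inputs such as protein sequences.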
