Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
