Efficient Content-Based Sparse Attention with Routing Transformers
