Pay Attention to MLPs

Neural Information Processing Systems

Our comparisons show that self-attention is not critical for Vision Transformers, as gMLP can achieve the same accuracy.