SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
