Supplementary Materials for M3 ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design