Scaling Vision with Sparse Mixture of Experts

Open in new window