MegaBlocks: Efficient Sparse Training with Mixture-of-Experts
