Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training
