Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline
