AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference
