PreScope: Unleashing the Power of Prefetching for Resource-Constrained MoE Inference
