Cluster Topology-Driven Placement of Experts Reduces Network Traffic in MoE Inference
