Breaking the MoE LLM Trilemma: Dynamic Expert Clustering with Structured Compression
