Semantic Specialization in MoE Appears with Scale: A Study of DeepSeek R1 Expert Specialization
