UMoE: Unifying Attention and FFN with Shared Experts
