On DeepSeekMoE: Statistical Benefits of Shared Experts and Normalized Sigmoid Gating