Efficient Training of Diffusion Mixture-of-Experts Models: A Practical Recipe