A Closer Look into Mixture-of-Experts in Large Language Models