Adaptive Gating in Mixture-of-Experts based Language Models