Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?
