Learning to Reason with Mixture of Tokens