Maximum Entropy Reinforcement Learning with Mixture Policies