Improving Generalization of Reinforcement Learning with Minimax Distributional Soft Actor-Critic
