Improving Generalization in Reinforcement Learning with Mixture Regularization

Neural Information Processing Systems 

Deep reinforcement learning (RL) agents trained in a limited set of environments tend to overfit and fail to generalize to unseen testing environments. To improve their generalizability, data augmentation approaches (e.g.