Wasserstein Robust Reinforcement Learning
Abdullah, Mohammed Amin; Ren, Hang; Ammar, Haitham Bou; Milenkovic, Vladimir; Luo, Rui; Zhang, Mingtian; Wang, Jun
arXiv.org Artificial Intelligence
Reinforcement learning algorithms, though successful, tend to overfit to their training environments, which hampers their application to the real world. This paper proposes WR$^{2}$L, a robust reinforcement learning algorithm with strong robust performance on both low- and high-dimensional control tasks. Our method formalises robust reinforcement learning as a novel min-max game with a Wasserstein constraint, a formulation that admits a correct and convergent solver. Beyond the formulation, we propose an efficient and scalable solver based on a novel zero-order optimisation method that we believe can be useful to numerical optimisation in general. We contribute both theoretically and empirically. On the theory side, we prove that WR$^{2}$L converges to a stationary point in the general setting of continuous state and action spaces. Empirically, we demonstrate significant gains over standard and robust state-of-the-art algorithms on high-dimensional MuJoCo environments.
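The abstract refers to a zero-order optimisation method, meaning one that uses only function evaluations and no analytic derivatives. As a rough illustration of that general idea, and not of the paper's actual solver, the sketch below estimates a gradient with a generic antithetic Gaussian-smoothing estimator. The names `zero_order_gradient` and `quadratic`, the step size `sigma`, and the sample count are all assumptions made for this example.

```python
import random


def zero_order_gradient(f, x, sigma=0.1, num_samples=1000, seed=0):
    """Estimate the gradient of f at x from function values only.

    Generic antithetic Gaussian-smoothing estimator (an illustrative
    assumption, not the solver proposed in the paper).
    """
    rng = random.Random(seed)
    dim = len(x)
    grad = [0.0] * dim
    for _ in range(num_samples):
        # Draw a random search direction.
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x_plus = [xi + sigma * ui for xi, ui in zip(x, u)]
        x_minus = [xi - sigma * ui for xi, ui in zip(x, u)]
        # Central finite difference along the direction u.
        slope = (f(x_plus) - f(x_minus)) / (2.0 * sigma)
        # Accumulate the running average of slope * u.
        for i in range(dim):
            grad[i] += slope * u[i] / num_samples
    return grad


def quadratic(x):
    """Hypothetical test objective with known gradient 2 * x."""
    return sum(xi * xi for xi in x)


estimate = zero_order_gradient(quadratic, [1.0, -2.0, 0.5], num_samples=20000)
print(estimate)
```

For the quadratic test objective the true gradient at the chosen point is `[2.0, -4.0, 1.0]`, so the printed estimate should land close to those values, with noise that shrinks as `num_samples` grows.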
Aug-10-2019
- Country:
- Europe (1.00)
- North America > United States (0.67)
- Genre:
- Research Report > New Finding (0.67)
- Industry:
- Education (0.48)
- Technology: