Wasserstein Robust Reinforcement Learning

Abdullah, Mohammed Amin; Ren, Hang; Ammar, Haitham Bou; Milenkovic, Vladimir; Luo, Rui; Zhang, Mingtian; Wang, Jun

arXiv.org Artificial Intelligence 

Reinforcement learning algorithms, though successful, tend to over-fit to their training environments, which hampers their application to the real world. This paper proposes WR$^{2}$L, a robust reinforcement learning algorithm that demonstrates significant robust performance on both low- and high-dimensional control tasks. Our method formalises robust reinforcement learning as a novel min-max game with a Wasserstein constraint, a formulation that admits a correct and convergent solver. Apart from the formulation, we also propose an efficient and scalable solver based on a novel zero-order optimisation method that we believe can be useful to numerical optimisation in general. We contribute both theoretically and empirically. On the theory side, we prove that WR$^{2}$L converges to a stationary point in the general setting of continuous state and action spaces. Empirically, we demonstrate significant gains compared to standard and robust state-of-the-art algorithms on high-dimensional MuJoCo environments.
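
For readers unfamiliar with the term, zero-order optimisation works from function evaluations alone, without analytic derivatives. The sketch below is a generic random-direction finite-difference gradient estimator. It is not the solver proposed in the paper, whose details the abstract does not give. The names `zero_order_gradient` and `quadratic`, and the values chosen for `num_samples`, `sigma`, and `seed`, are assumptions made for illustration only.

```python
import random


def zero_order_gradient(f, x, num_samples=2000, sigma=1e-2, seed=0):
    """Estimate the gradient of f at x using only evaluations of f.

    Generic illustration (hypothetical helper), not the WR^2L solver.
    """
    rng = random.Random(seed)
    dim = len(x)
    grad = [0.0] * dim
    for _ in range(num_samples):
        # Draw a random Gaussian search direction.
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x_plus = [xi + sigma * ui for xi, ui in zip(x, u)]
        x_minus = [xi - sigma * ui for xi, ui in zip(x, u)]
        # A central difference along u approximates the directional derivative.
        slope = (f(x_plus) - f(x_minus)) / (2.0 * sigma)
        # Averaging slope * u over many directions approximates the gradient.
        for i in range(dim):
            grad[i] += slope * u[i] / num_samples
    return grad


def quadratic(x):
    # Hypothetical test objective; its true gradient is 2 * x.
    return sum(xi * xi for xi in x)


estimate = zero_order_gradient(quadratic, [1.0, -2.0, 0.5])
```

On this toy objective the estimate should lie close to the true gradient `[2.0, -4.0, 1.0]`, with sampling noise that shrinks as `num_samples` grows.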
