Near-Optimal Distributionally Robust Reinforcement Learning with General $L_p$ Norms

Pierre Clavier

Neural Information Processing Systems 

To address the challenges of the sim-to-real gap and sample efficiency in reinforcement learning (RL), this work studies distributionally robust Markov decision processes (RMDPs), which optimize worst-case performance when the deployed environment lies within an uncertainty set around some nominal MDP. Despite recent efforts, the sample complexity of RMDPs has remained largely undetermined. While the statistical implications of distributional robustness in RL have been explored in some specific cases, the generalizability of the existing findings remains unclear, especially in comparison with standard RL.
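For concreteness, a common way to write the RMDP objective is as a max-min value over transition kernels in an $L_p$-ball around the nominal model; the notation below ($P^0$ for the nominal kernel, $\mathcal{U}_\sigma$ for the uncertainty set, $\sigma$ for its radius) is a generic, illustrative formulation and not necessarily the exact one adopted in the paper:
$$
\max_{\pi}\; \min_{P \in \mathcal{U}_\sigma(P^0)} \; \mathbb{E}_{\pi, P}\!\left[\sum_{t=0}^{\infty} \gamma^t r(s_t, a_t)\right],
\qquad
\mathcal{U}_\sigma(P^0) = \left\{ P : \big\|P(\cdot \mid s,a) - P^0(\cdot \mid s,a)\big\|_p \le \sigma \ \ \forall (s,a) \right\}.
$$
Here the $(s,a)$-rectangular constraint means each state-action pair's transition distribution may deviate independently from the nominal one, which is what makes the worst case tractable via a robust Bellman recursion.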