Bridging RL Theory and Practice with the Effective Horizon

Dec-26-2025, 15:16:20 GMT–Neural Information Processing Systems

Deep reinforcement learning (RL) works impressively in some environments and fails catastrophically in others. Ideally, RL theory should be able to provide an understanding of why this is, i.e. bounds predictive of practical performance. Unfortunately, current theory does not quite have this ability. We compare standard deep RL algorithms to prior sample complexity bounds by introducing a new dataset, BRIDGE. It consists of 155 MDPs from common deep RL benchmarks, along with their corresponding tabular representations, which enables us to exactly compute instance-dependent bounds.

bridging rl theory and practice, effective horizon, name change, (6 more...)

Neural Information Processing Systems

Dec-26-2025, 15:16:20 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)