Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies
– Neural Information Processing Systems
The results are based on a novel analysis of real-time dynamic programming, then extended to model-based RL. Specifically, we generalize existing algorithms that perform full planning to act by 1-step planning.
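
To illustrate what acting by 1-step planning can look like, here is a minimal sketch of finite-horizon real-time dynamic programming on a small tabular MDP. It is not the paper's algorithm: it assumes the transition model `P` and rewards `R` are known, whereas a model-based RL variant would presumably work from an estimated model with exploration bonuses. The function name `rtdp`, its parameters, and the toy two-state MDP are all hypothetical and chosen only for illustration.

```python
import numpy as np


def rtdp(P, R, horizon, episodes, start_state=0, seed=0):
    """Finite-horizon RTDP sketch (assumes a KNOWN model, rewards in [0, 1]).

    P: array of shape (S, A, S) with transition probabilities.
    R: array of shape (S, A) with expected rewards.
    Returns V of shape (horizon + 1, S).
    """
    rng = np.random.default_rng(seed)
    n_states = P.shape[0]

    # Optimistic initialization: horizon - h upper-bounds any value at step h.
    V = np.zeros((horizon + 1, n_states))
    for h in range(horizon):
        V[h] = horizon - h

    for _ in range(episodes):
        s = start_state
        for h in range(horizon):
            # 1-step lookahead from the current state only (no full planning).
            q = R[s] + P[s] @ V[h + 1]
            a = int(np.argmax(q))      # act greedily w.r.t. the lookahead
            V[h, s] = q[a]             # update only the visited state
            s = rng.choice(n_states, p=P[s, a])
    return V


# Hypothetical toy MDP: action 0 stays, action 1 switches state;
# the only reward is for staying in state 1.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, 0, 1] = 1.0
P[1, 1, 0] = 1.0
R = np.array([[0.0, 0.0], [1.0, 0.0]])

V = rtdp(P, R, horizon=3, episodes=10)
print(V[0, 0])
```

Each step costs one backup at the visited state, rather than a full sweep over all states and time steps as in value iteration. The optimistic initialization is what drives exploration in this sketch.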
Feb-11-2026, 17:45:22 GMT