Worst-Case Regret Bounds for Exploration via Randomized Value Functions
Neural Information Processing Systems
This paper studies a recent proposal to use randomized value functions to drive exploration in reinforcement learning.
Oct-2-2025, 15:29:10 GMT
- Country:
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > Canada (0.04)
- Genre:
- Overview (0.34)
- Research Report (0.34)
- Technology: