Worst-Case Regret Bounds for Exploration via Randomized Value Functions
Neural Information Processing Systems
This paper studies a recent proposal to use randomized value functions to drive exploration in reinforcement learning.
Oct-2-2025, 15:29:10 GMT
- Country:
  - Europe > United Kingdom > England (0.28)
- Genre:
  - Research Report (0.34)
  - Overview (0.34)
- Technology: