Reinforcement Learning with Logarithmic Regret and Policy Switches
Neural Information Processing Systems
In this paper, we study the problem of regret minimization for episodic Reinforcement Learning (RL) in both the model-free and the model-based settings.
Aug-19-2025, 16:08:00 GMT
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- Genre:
- Research Report (0.46)
- Technology: