Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Neural Information Processing Systems
The discounted MDP is one of the standard models in reinforcement learning for describing sequential tasks that run without interruption or restart.
Aug-17-2025, 02:58:02 GMT
- Country:
  - North America > United States
    - California > Los Angeles County > Los Angeles (0.28)
  - Europe > United Kingdom
    - England
      - Greater London > London (0.04)
      - Cambridgeshire > Cambridge (0.04)