Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model
Aaron Sidford, Mengdi Wang, Xian Wu, Lin Yang, Yinyu Ye
Neural Information Processing Systems
Computing an approximately optimal policy with high probability, when the transition probabilities can only be accessed by sampling, is known as PAC RL with a generative model.
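
As a rough formalization of this criterion, assume a discounted MDP with state space S, a value function v^π for each policy π, an optimal value function v^*, an accuracy parameter ε > 0, and a failure probability δ in (0, 1). These symbols are introduced here for illustration and are not taken from the text above. Under those assumptions, the PAC requirement on the returned policy can be sketched as:

```latex
\Pr\left[\, v^{\hat{\pi}}(s) \;\ge\; v^{*}(s) - \epsilon \quad \text{for all } s \in \mathcal{S} \,\right] \;\ge\; 1 - \delta
```

Here \hat{\pi} denotes the policy returned by the algorithm, and the generative model is an oracle that, given any state-action pair (s, a), returns a next state s' drawn from P(· | s, a).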