Is Q-Learning Provably Efficient? An Extended Analysis
Rastogi, Kushagra, Lee, Jonathan, Harel-Canada, Fabrice, Joglekar, Aditya
arXiv.org Artificial Intelligence
This work extends the analysis of the theoretical results presented in the paper "Is Q-Learning Provably Efficient?" by Jin et al. We include a survey of related research to contextualize the need for stronger theoretical guarantees in what are perhaps the most important threads of model-free reinforcement learning. We also expound upon the reasoning used in the proofs to highlight the critical steps leading to the main result: Q-learning with UCB exploration achieves a sample efficiency that matches the optimal regret achievable by any model-based approach.
Sep-22-2020
- Country:
  - Europe > United Kingdom
    - England > Cambridgeshire > Cambridge (0.14)
  - North America > United States
    - California
      - Los Angeles County > Los Angeles (0.14)
      - San Francisco County > San Francisco (0.14)
- Genre:
  - Research Report > New Finding (0.48)
- Technology: