Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
Kearns, Michael J., Singh, Satinder P.
In this paper, we address two issues of longstanding interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning after only a finite number of actions? Second, what quantitative comparisons can be made between Q-learning and model-based (indirect) approaches, which use experience to estimate next-state distributions for off-line value iteration? We first show that both Q-learning and the indirect approach enjoy rather rapid convergence to the optimal policy as a function of the number of state transitions observed.
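For concreteness, the following is a minimal tabular sketch (not from the paper) contrasting the two approaches the abstract refers to: the direct Q-learning update applied per observed transition, and the indirect approach that estimates next-state distributions from transition counts and then runs off-line value iteration on the estimated model. All function names, array shapes, and the handling of rewards here are illustrative assumptions.

import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    # Direct (model-free) approach: one Q-learning update per observed transition.
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

def indirect_value_iteration(counts, rewards, gamma=0.9, iters=100):
    # Indirect (model-based) approach: estimate next-state distributions from
    # transition counts, then run off-line value iteration on the estimated model.
    # counts has shape (S, A, S); rewards has shape (S, A) (assumed known or estimated).
    S, A, _ = counts.shape
    totals = counts.sum(axis=2, keepdims=True)
    P_hat = counts / np.maximum(totals, 1)        # empirical P(s' | s, a)
    Q = np.zeros((S, A))
    for _ in range(iters):
        V = Q.max(axis=1)                         # greedy state values
        Q = rewards + gamma * (P_hat @ V)         # Bellman backup under the estimated model
    return Q

# Example usage on a random 5-state, 2-action MDP (hypothetical data):
rng = np.random.default_rng(0)
counts = rng.integers(0, 10, size=(5, 2, 5)).astype(float)
rewards = rng.random((5, 2))
Q_model = indirect_value_iteration(counts, rewards)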
Neural Information Processing Systems
Dec-31-1999