Is Q-Learning Provably Efficient? An Extended Analysis
