Is Q-learning Provably Efficient?
