Reviews: Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle

Neural Information Processing Systems 

The paper proposes an adaptation of the classical Q-learning algorithm with linear function approximation that enjoys polynomial sample complexity. All reviewers feel the paper makes an interesting contribution to the RL literature that should appear at this conference, and I therefore recommend acceptance.