Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning