Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms