Stability of Q-Learning Through Design and Optimism