Decentralized Q-Learning in Zero-sum Markov Games