Deep SOR Minimax Q-learning for Two-player Zero-sum Game