Deep SOR Minimax Q-learning for Two-player Zero-sum Game
