Robust $Q$-learning Algorithm for Markov Decision Processes under Wasserstein Uncertainty
