Reinforcement Learning with Wasserstein Distance Regularisation, with Applications to Multipolicy Learning
