Thompson Sampling for Parameterized Markov Decision Processes with Uninformative Actions
