Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes
