Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
