Minimax Optimal Reinforcement Learning with Quasi-Optimism
