Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
