Kernelized Reinforcement Learning with Order Optimal Regret Bounds
