Frequentist Regret Bounds for Randomized Least-Squares Value Iteration
