Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration
