Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret
