Stronger Regret Bounds for Safe Online Reinforcement Learning in the Linear Quadratic Regulator
