Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation
