Provably Efficient Infinite-Horizon Average-Reward Reinforcement Learning with Linear Function Approximation
