Span-Agnostic Optimal Sample Complexity and Oracle Inequalities for Average-Reward RL
