Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs
