Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
