PAC Reinforcement Learning without Real-World Feedback