Provably Efficient Offline Reinforcement Learning in Regular Decision Processes
