Post-Episodic Reinforcement Learning Inference