RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation