Learning to Undo: Rollback-Augmented Reinforcement Learning with Reversibility Signals