STAIR: Addressing Stage Misalignment through Temporal-Aligned Preference Reinforcement Learning