Intra-Trajectory Consistency for Reward Modeling