Reward-Augmented Data Enhances Direct Preference Alignment of LLMs