Robust Preference Optimization via Dynamic Target Margins