CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation
