Preference Optimization as Probabilistic Inference