Preference Optimization by Estimating the Ratio of the Data Distribution