Preference Models assume Proportional Hazards of Utilities

Nagpal, Chirag

arXiv.org Machine Learning 

Modelling of human preferences is an important step in modern post-training pipelines for AI alignment. One popular approach to building such models of human preference is to assume that human preference rankings follow a Plackett-Luce (Plackett, 1975; Luce, 1959) distribution. In this monograph, I draw a somewhat remarkable connection between the Cox Proportional Hazards model (Cox, 1972), a popular statistical model for estimating lifetimes, and the Plackett-Luce model, and consequently to algorithms such as Direct Preference Optimization, a popular algorithm for aligning modern Artificial Intelligence (Ouyang et al., 2022). To the best of my knowledge, at the time of writing the connection between the Proportional Hazards model and the Plackett-Luce model is relatively little known, and the subsequent connections to AI alignment algorithms such as Direct Preference Optimization (Rafailov et al., 2023) are not well appreciated. I believe that explicitly stating this connection will help the AI research community draw on existing research in semi-parametric statistics to build better models of human preference.
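
To make the stated connection concrete, the following is a brief sketch of the shared likelihood form; the notation below is mine rather than the paper's. Writing $r(y)$ for the latent utility of a response $y$, the Plackett-Luce probability of observing the ranking $y_1 \succ y_2 \succ \cdots \succ y_K$ is

\[ P(y_1 \succ y_2 \succ \cdots \succ y_K) = \prod_{k=1}^{K} \frac{\exp\{r(y_k)\}}{\sum_{j=k}^{K} \exp\{r(y_j)\}}, \]

which has exactly the form of the Cox partial likelihood, where the log-hazard ratio $x_{(i)}^\top \beta$ of the subject failing at time $t_i$ plays the role of the utility, and the risk set $R(t_i)$ shrinks after each event just as the Plackett-Luce choice set shrinks after each selection:

\[ L(\beta) = \prod_{i} \frac{\exp\{x_{(i)}^\top \beta\}}{\sum_{j \in R(t_i)} \exp\{x_j^\top \beta\}}. \]

For pairwise comparisons ($K = 2$) the Plackett-Luce likelihood reduces to the Bradley-Terry form $\sigma(r(y_w) - r(y_l))$, which is the preference probability that Direct Preference Optimization fits.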
