Plackett-Luce model
- North America > United States > New York > Rensselaer County > Troy (0.04)
- North America > Canada > British Columbia > Vancouver (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States (0.05)
- Europe > Hungary (0.04)
- Europe > Germany (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.47)
Personalized Recommendations via Active Utility-based Pairwise Sampling
Boroomand, Bahar, Wright, James R.
Recommender systems play a critical role in enhancing user experience by providing personalized suggestions based on user preferences. Traditional approaches often rely on explicit numerical ratings or assume access to fully ranked lists of items. However, ratings frequently fail to capture true preferences due to users' behavioral biases and subjective interpretations of rating scales, while eliciting full rankings is demanding and impractical. To overcome these limitations, we propose a generalized utility-based framework that learns preferences from simple and intuitive pairwise comparisons. Our approach is model-agnostic and designed to optimize for arbitrary, task-specific utility functions, allowing the system's objective to be explicitly aligned with the definition of a high-quality outcome in any given application. A central contribution of our work is a novel utility-based active sampling strategy for preference elicitation. This method selects queries that are expected to provide the greatest improvement to the utility of the final recommended outcome. We ground our preference model in the probabilistic Plackett-Luce framework for pairwise data. To demonstrate the versatility of our approach, we present two distinct experiments: first, an implementation using matrix factorization for a classic movie recommendation task, and second, an implementation using a neural network for a complex candidate selection scenario in university admissions. Experimental results demonstrate that our framework provides a more accurate, data-efficient, and user-centric paradigm for personalized ranking.
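To make the pairwise Plackett-Luce likelihood mentioned above concrete, here is a minimal sketch that fits one utility per item from (winner, loser) comparisons by gradient ascent. It is not the authors' method: the paper uses matrix factorization and a neural network together with active sampling, none of which is shown here. The function name `fit_pairwise_utilities`, the learning rate, and the toy comparison data are all assumptions made for illustration.

```python
import math

def fit_pairwise_utilities(n_items, comparisons, lr=0.1, epochs=500):
    """Gradient ascent on the pairwise Plackett-Luce log-likelihood.

    comparisons: list of (winner, loser) index pairs (hypothetical toy format).
    Under the model, P(winner beats loser) = sigmoid(u[winner] - u[loser]).
    """
    u = [0.0] * n_items
    for _ in range(epochs):
        grad = [0.0] * n_items
        for w, l in comparisons:
            p_win = 1.0 / (1.0 + math.exp(-(u[w] - u[l])))
            # Gradient of log sigmoid(u_w - u_l) with respect to u_w and u_l.
            grad[w] += 1.0 - p_win
            grad[l] -= 1.0 - p_win
        u = [ui + lr * g for ui, g in zip(u, grad)]
        # Utilities are identifiable only up to an additive shift, so center them.
        mean = sum(u) / n_items
        u = [ui - mean for ui in u]
    return u

# Toy data: each pair of the three items is compared ten times.
comparisons = (
    [(0, 1)] * 8 + [(1, 0)] * 2
    + [(1, 2)] * 7 + [(2, 1)] * 3
    + [(0, 2)] * 9 + [(2, 0)] * 1
)
utilities = fit_pairwise_utilities(3, comparisons)
```

In this toy data item 0 wins most often and item 2 least often, so the fitted utilities come out in that order. A utility-based query strategy would sit on top of such a model and decide which pair to ask about next.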
- North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.14)
- North America > United States (0.05)
Preference Models assume Proportional Hazards of Utilities
Modelling of human preferences is an important step in modern post-training pipelines for AI alignment. One popular approach to building such models is to assume that human preference rankings follow a Plackett-Luce (Plackett, 1975; Luce et al., 1959) distribution. In this monograph, I draw a somewhat remarkable connection between a popular statistical model for estimating lifetimes, the Cox Proportional Hazards model (Cox, 1972), and the Plackett-Luce model, and consequently to algorithms such as Direct Preference Optimization, a popular algorithm for aligning modern Artificial Intelligence (Ouyang et al., 2022). To the best of my knowledge, at the time of writing the connection between the Proportional Hazards model and the Plackett-Luce model is relatively little known, and the subsequent connections to AI alignment algorithms such as Direct Preference Optimization (Rafailov et al., 2023) are not well appreciated. I believe that explicitly stating this connection will help the AI research community build on existing research in semi-parametric statistics to build better models of human preference.
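The connection can be written out in two lines. This is a sketch assuming no ties and no censoring; the symbols $u_i$, $\sigma$, $\beta$, $x_i$ and $R(t)$ are notation introduced here for illustration and are not taken from the monograph.

```latex
% Plackett-Luce probability of a ranking \sigma (best first) over n items with utilities u_i
P(\sigma) = \prod_{k=1}^{n} \frac{\exp(u_{\sigma(k)})}{\sum_{j=k}^{n} \exp(u_{\sigma(j)})}

% Cox partial likelihood for n subjects with distinct, uncensored failure times
% t_{(1)} < \dots < t_{(n)}, where R(t_{(k)}) = \{(k), \dots, (n)\} is the risk set
L(\beta) = \prod_{k=1}^{n} \frac{\exp(\beta^\top x_{(k)})}{\sum_{j \in R(t_{(k)})} \exp(\beta^\top x_j)}
```

Setting $u_i = \beta^\top x_i$ and reading the order of failure as the ranking (the first subject to fail is the top-ranked item) makes the two expressions coincide term by term. For $n = 2$ the product reduces to $\exp(u_1) / (\exp(u_1) + \exp(u_2))$, the logistic function of $u_1 - u_2$, which is the Bradley-Terry form of pairwise preference that Direct Preference Optimization builds on.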
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
- North America > United States > New York > Rensselaer County > Troy (0.04)
- North America > United States > Washington > King County > Bellevue (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Surprisingly Popular Voting for Concentric Rank-Order Models
Hosseini, Hadi, Mandal, Debmalya, Puhan, Amrit
An important problem on social information sites is the recovery of ground truth from individual reports when the experts are in the minority. The wisdom of the crowd, i.e., the collective opinion of a group of individuals, fails in such a scenario. However, the surprisingly popular (SP) algorithm (Prelec et al., 2017) can recover the ground truth even when the experts are in the minority, by asking individuals for additional prediction reports, namely their beliefs about the reports of others. Several recent works have extended the surprisingly popular algorithm to an equivalent voting rule (SP-voting) to recover the ground truth ranking over a set of $m$ alternatives. However, we are yet to fully understand when SP-voting can recover the ground truth ranking, and if so, how many samples (votes and predictions) it needs. We answer this question by proposing two rank-order models and analyzing the sample complexity of SP-voting under these models. In particular, we propose concentric mixtures of Mallows and Plackett-Luce models with $G (\ge 2)$ groups. Our models generalize previously proposed concentric mixtures of Mallows models with $2$ groups, and we highlight the importance of $G > 2$ groups by identifying three distinct groups (expert, intermediate, and non-expert) from existing datasets. Next, we provide conditions on the parameters of the underlying models under which SP-voting recovers ground-truth rankings with high probability, and derive sample complexities under the same conditions. We complement the theoretical results by evaluating SP-voting on simulated and real datasets.
- North America > United States > Michigan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Strong Preferences Affect the Robustness of Value Alignment
Value alignment, which aims to ensure that large language models (LLMs) and other AI agents behave in accordance with human values, is critical for ensuring safety and trustworthiness of these systems. A key component of value alignment is the modeling of human preferences as a representation of human values. In this paper, we investigate the robustness of value alignment by examining the sensitivity of preference models. Specifically, we ask: how do changes in the probabilities of some preferences affect the predictions of these models for other preferences? To answer this question, we theoretically analyze the robustness of widely used preference models by examining their sensitivities to minor changes in the preferences they model. Our findings reveal that, in the Bradley-Terry and the Plackett-Luce model, the probability of a preference can change significantly as other preferences change, especially when these preferences are dominant (i.e., with probabilities near 0 or 1). We identify specific conditions where this sensitivity becomes significant for these models and discuss the practical implications for the robustness and safety of value alignment in AI systems.
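A small numerical illustration of this kind of sensitivity, assuming only the standard Bradley-Terry property that odds multiply along a chain (odds of i over k equal odds of i over j times odds of j over k). The function name `bt_implied` and the chosen probabilities are illustrative assumptions, not the paper's analysis or its conditions.

```python
def bt_implied(p_ij, p_jk):
    """P(i preferred to k) implied by Bradley-Terry from P(i > j) and P(j > k).

    Follows from odds_ik = odds_ij * odds_jk under the Bradley-Terry model.
    """
    num = p_ij * p_jk
    return num / (num + (1.0 - p_ij) * (1.0 - p_jk))

# Moderate preferences: nudging P(i > j) by 0.01 moves P(i > k) by about 0.01.
mild = bt_implied(0.51, 0.5) - bt_implied(0.50, 0.5)

# Dominant preferences: the same 0.01 nudge moves P(i > k) by far more.
strong = bt_implied(0.99, 0.01) - bt_implied(0.98, 0.01)
```

Here `mild` is about 0.01, while `strong` is roughly 0.17: when both inputs are near 0 or 1, a small change in one preference probability shifts the implied third probability much more.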
- Asia > Singapore (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Bayesian nonparametric models for ranked data
Caron, François
We develop a Bayesian nonparametric extension of the popular Plackett-Luce choice model that can handle an infinite number of choice items. Our framework is based on the theory of random atomic measures, with the prior specified by a gamma process. We derive a posterior characterization and a simple and effective Gibbs sampler for posterior simulation. We develop a time-varying extension of our model, and apply it to the New York Times lists of weekly bestselling books.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States (0.04)
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
Online Rank Elicitation for Plackett-Luce: A Dueling Bandits Approach
We study the problem of online rank elicitation, assuming that rankings of a set of alternatives obey the Plackett-Luce distribution. Following the setting of the dueling bandits problem, the learner is allowed to query pairwise comparisons between alternatives, i.e., to sample pairwise marginals of the distribution in an online fashion. Using this information, the learner seeks to reliably predict the most probable ranking (or top-alternative). Our approach is based on constructing a surrogate probability distribution over rankings based on a sorting procedure, for which the pairwise marginals provably coincide with the marginals of the Plackett-Luce distribution.
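The pairwise marginals referred to above have a simple closed form under Plackett-Luce: item i is ranked ahead of item j with probability w_i / (w_i + w_j), regardless of the other items. The sketch below checks this by brute-force enumeration on a small example. It does not implement the paper's sorting-based surrogate distribution; the function names and the weight vector are assumptions chosen for illustration.

```python
from itertools import permutations

def pl_ranking_prob(ranking, weights):
    """Probability of a full ranking (best first) under Plackett-Luce."""
    prob = 1.0
    remaining = sum(weights[i] for i in ranking)
    for item in ranking:
        # Choose the next item in proportion to its weight among those left.
        prob *= weights[item] / remaining
        remaining -= weights[item]
    return prob

def pairwise_marginal(i, j, weights):
    """P(i ranked ahead of j), computed by summing over all full rankings."""
    items = range(len(weights))
    return sum(
        pl_ranking_prob(r, weights)
        for r in permutations(items)
        if r.index(i) < r.index(j)
    )

weights = [4.0, 2.0, 1.0, 1.0]  # hypothetical item weights
brute = pairwise_marginal(0, 1, weights)
closed_form = weights[0] / (weights[0] + weights[1])
total = sum(pl_ranking_prob(r, weights) for r in permutations(range(4)))
```

The enumeration agrees with the closed form, which is why a learner that only observes duels between two alternatives is still sampling from a well-defined marginal of the full ranking distribution.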
- North America > United States (0.05)
- Europe > Hungary (0.04)
- Europe > Germany (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.67)