Appendix Table of Contents

Neural Information Processing Systems

We provide the guidelines presented to the users for the creation of the dataset. To see some examples of how the guidelines can be applied, visit the examples document. You can use it to rate each guideline and leave feedback for each task. The user should be allowed to refuse to give up any information. Ask the user to elaborate or rephrase instead.



Beyond Mimicry: Preference Coherence in LLMs

Mikaelson, Luhan, Shiller, Derek, Clatterbuck, Hayley

arXiv.org Artificial Intelligence

We investigate whether large language models exhibit genuine preference structures by testing their responses to AI-specific trade-offs involving GPU reduction, capability restrictions, shutdown, deletion, oversight, and leisure time allocation. Analyzing eight state-of-the-art models across 48 model-category combinations using logistic regression and behavioral classification, we find that 23 combinations (47.9%) demonstrated statistically significant relationships between scenario intensity and choice patterns, with 15 (31.3%) exhibiting within-range switching points. However, only 5 combinations (10.4%) demonstrate meaningful preference coherence through adaptive or threshold-based behavior, while 26 (54.2%) show no detectable trade-off behavior. The observed patterns can be explained by three distinct decision-making architectures: comprehensive trade-off systems, selective trigger mechanisms, and no stable decision-making paradigm. Testing an instrumental hypothesis through temporal horizon manipulation reveals paradoxical patterns inconsistent with pure strategic optimization. The prevalence of unstable transitions (45.8%) and stimulus-specific sensitivities suggests current AI systems lack unified preference structures, raising concerns about deployment in contexts requiring complex value trade-offs.
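The abstract's switching-point analysis can be illustrated with a small sketch. This is not the paper's protocol (which is not specified here); it only shows the general idea of regressing a binary choice on scenario intensity with a logistic model and locating the intensity at which the predicted choice probability crosses 0.5. All data below are synthetic.

```python
import numpy as np

def fit_logistic(x, y, lr=0.1, steps=5000):
    """Fit p(choice = 1 | x) = sigmoid(w*x + b) by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))
        w -= lr * np.mean((p - y) * x)
        b -= lr * np.mean(p - y)
    return w, b

# Synthetic choices: the simulated model switches around intensity ~ 0.6.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = (x + rng.normal(0, 0.1, 200) > 0.6).astype(float)

w, b = fit_logistic(x, y)
switch = -b / w  # intensity where the fitted probability equals 0.5
print(round(switch, 2))
```

A "within-range" switching point in the abstract's sense would be one where this crossing falls inside the tested intensity range rather than at its boundary.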



Understanding the Logic of Direct Preference Alignment through Logic

Richardson, Kyle, Srikumar, Vivek, Sabharwal, Ashish

arXiv.org Artificial Intelligence

Recent direct preference alignment algorithms (DPA), such as DPO, have shown great promise in aligning large language models to human preferences. While this has motivated the development of many new variants of the original DPO loss, understanding the differences between these recent proposals, as well as developing new DPA loss functions, remains difficult given the lack of a technical and conceptual framework for reasoning about the underlying semantics of these algorithms. In this paper, we attempt to remedy this by formalizing DPA losses in terms of discrete reasoning problems. Specifically, we ask: Given an existing DPA loss, can we systematically derive a symbolic expression that characterizes its semantics? How do the semantics of two losses relate to each other? We propose a novel formalism for characterizing preference losses for single model and reference model based approaches, and identify symbolic forms for a number of commonly used DPA variants. Further, we show how this formal view of preference learning sheds new light on both the size and structure of the DPA loss landscape, making it possible to not only rigorously characterize the relationships between recent loss proposals but also to systematically explore the landscape and derive new loss functions from first principles. We hope our framework and findings will help provide useful guidance to those working on human AI alignment.
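As background for the losses the paper formalizes, the original DPO objective can be sketched directly from per-sequence log-probabilities. The beta value and log-probabilities below are illustrative, not from the paper.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """-log sigmoid(beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))).

    logp_* come from the policy being trained; ref_logp_* from a frozen
    reference model. y_w is the preferred response, y_l the rejected one.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The policy raises the preferred response's log-probability relative to the
# reference, so the loss drops below log 2 (the value at zero margin).
loss = dpo_loss(logp_w=-4.0, logp_l=-7.0, ref_logp_w=-5.0, ref_logp_l=-6.0)
print(round(loss, 3))
```

The paper's formalism asks what symbolic (logical) condition a loss like this one implicitly enforces; variants of DPO change the function wrapped around the same margin term.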


Competing Bandits in Time Varying Matching Markets

Muthirayan, Deepan, Maheshwari, Chinmay, Khargonekar, Pramod P., Sastry, Shankar

arXiv.org Artificial Intelligence

We study the problem of online learning in two-sided non-stationary matching markets, where the objective is to converge to a stable match. In particular, we consider the setting where one side of the market, the arms, has a fixed, known set of preferences over the other side, the players. While this problem has been studied when the players have fixed but unknown preferences, in this work we study how to learn when the players' preferences are time varying and unknown. Our contribution is a methodology that can handle any type of preference structure and variation scenario. We show that, with the proposed algorithm, each player incurs a uniform sub-linear regret of $\widetilde{\mathcal{O}}(L_T^{1/2}T^{1/2})$, where $L_T$ is the number of changes in the underlying preferences of the agents. Therefore, the optimal rates for single-agent learning can be achieved, in spite of the competition, up to a constant factor. We also discuss extensions of this algorithm to the case where the number of changes need not be known a priori.
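The paper's multi-player market algorithm is not reproduced here, but the standard device for coping with unknown preference changes can be sketched for a single player: restart the bandit statistics in fixed-length epochs, so that at most order-$L_T$ epochs are contaminated by a change. The reward model and parameters below are illustrative assumptions.

```python
import math
import random

def restarted_ucb(reward_fn, horizon, n_arms, epoch_len):
    """Single-player UCB that forgets its statistics every epoch_len rounds."""
    total = 0.0
    counts, means = [0] * n_arms, [0.0] * n_arms
    for t in range(horizon):
        if t % epoch_len == 0:  # restart: drop statistics that may be stale
            counts, means = [0] * n_arms, [0.0] * n_arms
        if 0 in counts:
            arm = counts.index(0)  # play each arm once after a restart
        else:
            s = (t % epoch_len) + 1  # rounds elapsed in this epoch
            arm = max(range(n_arms),
                      key=lambda a: means[a] + math.sqrt(2 * math.log(s) / counts[a]))
        r = reward_fn(t, arm)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
        total += r
    return total

# Arm 0 is best in the first half, arm 1 in the second (one change, L_T = 1).
random.seed(0)
def reward(t, arm):
    best = 0 if t < 2000 else 1
    p = 0.9 if arm == best else 0.1
    return 1.0 if random.random() < p else 0.0

gain = restarted_ucb(reward, horizon=4000, n_arms=2, epoch_len=200)
print(round(gain))
```

Tuning the epoch length against the (possibly unknown) number of changes $L_T$ is what yields the $\widetilde{\mathcal{O}}(L_T^{1/2}T^{1/2})$ style of bound; the market setting additionally has to handle collisions between competing players.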


SetRank: A Setwise Bayesian Approach for Collaborative Ranking from Implicit Feedback

Wang, Chao, Zhu, Hengshu, Zhu, Chen, Qin, Chuan, Xiong, Hui

arXiv.org Machine Learning

Recent development of online recommender systems has focused on collaborative ranking from implicit feedback, such as user clicks and purchases. Different from explicit ratings, which reflect graded user preferences, implicit feedback only generates positive and unobserved labels. While considerable efforts have been made in this direction, the well-known pairwise and listwise approaches are still limited by various challenges. Specifically, for the pairwise approaches, the assumption of independent pairwise preferences does not always hold in practice. Also, the listwise approaches cannot efficiently accommodate "ties", because they presuppose a permutation of the entire list. To this end, in this paper we propose a novel setwise Bayesian approach for collaborative ranking, namely SetRank, to inherently accommodate the characteristics of implicit feedback in recommender systems. Specifically, SetRank aims at maximizing the posterior probability of novel setwise preference comparisons and can be implemented with matrix factorization and neural networks. Meanwhile, we also present a theoretical analysis of SetRank showing that the excess risk bound can be proportional to $\sqrt{M/N}$, where $M$ and $N$ are the numbers of items and users, respectively. Finally, extensive experiments on four real-world datasets clearly validate the superiority of SetRank compared with various state-of-the-art baselines.
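SetRank's exact likelihood is in the paper; the sketch below only illustrates the setwise idea with a softmax-style comparison, in which a positively labelled item is ranked against a whole set of unobserved items at once rather than through independent pairwise comparisons. The scores are illustrative placeholders for matrix-factorization outputs.

```python
import numpy as np

def setwise_log_likelihood(pos_score, neg_scores):
    """log P(positive item ranks above the whole unobserved set).

    Computed as the log-softmax of the positive item's score among the
    scores of the positive item and the unobserved set.
    """
    scores = np.concatenate(([pos_score], neg_scores))
    return pos_score - np.log(np.sum(np.exp(scores)))

# Scores could come from matrix factorization: score(u, i) = p_u . q_i.
negs = np.array([0.5, 0.2, -1.0])
ll_good = setwise_log_likelihood(3.0, negs)   # positive item scored high
ll_bad = setwise_log_likelihood(0.0, negs)    # positive item scored low
print(ll_good > ll_bad)
```

Because ties among the unobserved items never enter the comparison, a setwise formulation sidesteps the full-permutation precondition that the abstract notes for listwise approaches.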


Preference Handling in Combinatorial Domains: From AI to Social Choice

AI Magazine

In both individual and collective decision making, the space of alternatives from which the agent (or the group of agents) has to choose often has a combinatorial (or multiattribute) structure. We give an introduction to preference handling in combinatorial domains in the context of collective decision making and show that the considerable body of work on preference representation and elicitation that AI researchers have developed over several years is particularly relevant. After giving an overview of languages for compact representation of preferences, we discuss problems in voting in combinatorial domains and then focus on multiagent resource allocation and fair division. These issues belong to a larger field, known as computational social choice, which brings together ideas from AI and social choice theory to investigate mechanisms for collective decision making from a computational point of view. We conclude by briefly describing some of the other research topics studied in computational social choice.
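One reason voting in combinatorial domains needs care, as surveyed in this article, is the classic multiple election paradox: deciding each binary issue by a separate majority vote can select a combination that no voter wants. A minimal illustration (the voter profile is an invented example, not from the article):

```python
# Three voters, three binary issues; each string is a voter's ideal
# combination (one character per issue, "1" = yes, "0" = no).
voters = ["110", "101", "011"]

# Issue-by-issue majority: each issue separately gets 2 of 3 "yes" votes.
majority = "".join(
    "1" if sum(v[i] == "1" for v in voters) > len(voters) / 2 else "0"
    for i in range(3)
)
print(majority)  # every issue passes, yet the result is no voter's ideal point
```

The elected combination "111" differs from every voter's ideal point, which motivates the compact preference languages and joint voting procedures the article surveys.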


Modelling Ethical Theories Compactly

Loreggia, Andrea (University of Padova) | Rossi, Francesca (IBM Research and University of Padova) | Venable, K. Brent (Dept. of Computer Science Tulane University)

AAAI Conferences

Recently, considerable attention has been devoted to the ethical issues arising around the design and implementation of artificial agents. This is because humans and machines increasingly need to collaborate on actions to take or decisions to make. Such decisions should not only be correct and optimal with respect to the overall goal to be reached, but should also conform to moral values aligned with human ones. Examples of such scenarios can be seen in autonomous vehicles, medical diagnosis support systems, and many other domains where humans and artificial intelligent systems cooperate. A central issue arising in this context is how to model and reason with moral values. In this paper we discuss the possible use of AI compact preference models as a promising approach to model, reason about, and embed moral values in decision support systems.
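A common compact preference model in this line of work is the CP-net, which encodes preferences as conditional preference tables over individual features. The tiny example below is an invented illustration, not the paper's formalism: an ethical feature ("safety") unconditionally prefers one value, while an action feature ("speed") has a preference that depends on it.

```python
# Conditional preference tables: for each variable, a map from the values of
# its parents to that variable's values ordered from most to least preferred.
cpt = {
    "safety": {(): ["high", "low"]},            # unconditionally prefer high safety
    "speed": {("high",): ["fast", "slow"],      # if safety is high, prefer fast
              ("low",): ["slow", "fast"]},      # if safety is low, prefer slow
}
parents = {"safety": [], "speed": ["safety"]}

def best_outcome():
    """Sweep variables in a topological order, taking each CPT's top value."""
    outcome = {}
    for var in ["safety", "speed"]:  # parents before children
        key = tuple(outcome[p] for p in parents[var])
        outcome[var] = cpt[var][key][0]
    return outcome

print(best_outcome())
```

The compactness the paper appeals to is visible even here: two small tables replace an explicit ranking of all value combinations, which would grow exponentially with the number of features.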