Fragile Preferences: A Deep Dive Into Order Effects in Large Language Models

Yin, Haonan, Vardi, Shai, Choudhary, Vidyanand

arXiv.org Artificial Intelligence 

Large language models (LLMs) are increasingly deployed in decision-support systems for high-stakes domains such as hiring and university admissions, where choices often involve selecting among competing alternatives. While prior work has noted position order biases in LLM-driven comparisons, these biases have not been systematically analyzed or linked to underlying preference structures. We present the first comprehensive study of position biases across multiple LLMs and two distinct domains: resume comparisons, representing a realistic high-stakes context, and color selection, which isolates position effects by removing confounding factors. We find strong and consistent order effects, including a quality-dependent shift: when all options are high quality, models favor the first option, but when quality is lower, they favor later options. We also identify two previously undocumented biases in both human and machine decision-making: a centrality bias (favoring the middle position in triplewise comparisons) and a name bias, where certain names are favored despite controlling for demographic signals. To separate superficial tie-breaking from genuine distortions of judgment, we extend the rational choice framework to classify pairwise preferences as robust, fragile, or indifferent. Using this framework, we show that order effects can lead models to select strictly inferior options, and that position biases are typically stronger than gender biases. These results indicate that LLMs exhibit distinct failure modes not documented in human decision-making. We also propose targeted mitigation strategies, including a novel use of the temperature parameter, to recover underlying preferences when order effects distort model behavior.