The Order Effect: Investigating Prompt Sensitivity in Closed-Source LLMs
Guan, Bryan, Roosta, Tanya, Passban, Peyman, Rezagholizadeh, Mehdi
arXiv.org Artificial Intelligence
As large language models (LLMs) become integral to diverse applications, ensuring their reliability under varying input conditions is crucial. One key issue affecting this reliability is order sensitivity, wherein slight variations in input arrangement can lead to inconsistent or biased outputs. Although recent advances have reduced this sensitivity, the problem remains unresolved. This paper investigates the extent of order sensitivity in closed-source LLMs by conducting experiments across multiple tasks, including paraphrasing, relevance judgment, and multiple-choice questions. Our results show that input order significantly affects performance across tasks, with shuffled inputs leading to measurable declines in output accuracy. Few-shot prompting shows mixed effectiveness: it offers partial mitigation but fails to fully resolve the problem. These findings highlight persistent risks, particularly in high-stakes applications, and point to the need for more robust LLMs or improved input-handling techniques in future development.
In recent years, large language models (LLMs) have become essential across various applications, helping users complete tasks in diverse domains, thanks to their remarkable abilities in understanding, analyzing, and generating text (Shen et al., 2023a; Yu et al., 2023). However, LLMs are not without their problems and risks. Many of these issues, such as bias (Talat et al., 2022; Motoki et al., 2023), hallucination (Chen et al., 2023; Sadat et al., 2023), consistency (Tam et al., 2023; Ye et al., 2023), and reliability (Shen et al., 2023b), have been extensively discussed in the literature. A more fundamental challenge to the long-term success of LLMs, however, is their ability to reason: the distinguishing factor between probabilistic pattern matching and logical understanding. This distinction has significant implications for the future of LLMs and how we employ these models in decision-making.
One necessary requirement for reasoning is order independence.
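To make the notion of order independence concrete, the following is a minimal sketch of how one might score it for a multiple-choice question: present every permutation of the options and measure how often the answer stays the same. This is an illustration only, not the protocol used in the paper. The names `order_consistency`, `first_option_picker`, and `longest_option_picker` are hypothetical, and the two pickers are toy stand-ins for a real model call.

```python
import itertools
from typing import Callable, Sequence


def order_consistency(
    answer_fn: Callable[[str, Sequence[str]], str],
    question: str,
    options: Sequence[str],
) -> float:
    """Return the fraction of option orderings that yield the modal answer.

    A score of 1.0 means the answer never changes when options are reordered.
    """
    # Query the (hypothetical) answerer once per permutation of the options.
    answers = [
        answer_fn(question, list(perm))
        for perm in itertools.permutations(options)
    ]
    # Count how often the most frequent answer occurs.
    modal_count = max(answers.count(a) for a in set(answers))
    return modal_count / len(answers)


# Toy stand-ins for a model (assumptions, not real LLM behavior).
def first_option_picker(question: str, options: Sequence[str]) -> str:
    # Fully position-biased: always returns whatever is listed first.
    return options[0]


def longest_option_picker(question: str, options: Sequence[str]) -> str:
    # Order-independent when option lengths are unique.
    return max(options, key=len)


if __name__ == "__main__":
    opts = ["Paris", "Rome", "Berlin"]
    print(order_consistency(first_option_picker, "Which city?", opts))
    print(order_consistency(longest_option_picker, "Which city?", opts))
```

Enumerating all permutations grows factorially with the number of options, so a practical evaluation would likely sample a fixed number of random shuffles instead.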
Feb-6-2025