Understanding the Effects of Iterative Prompting on Truthfulness

Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju

arXiv.org Artificial Intelligence 

The advent and rapid evolution of Large Language Models (LLMs) represent a profound shift in the artificial intelligence landscape [1, 2]. These models, distinguished by their significant learning capabilities, have demonstrated exceptional aptitude in generating coherent and contextually relevant text [3]. This capability has rendered them invaluable across diverse sectors, including finance, healthcare, and autonomous systems, transforming conventional approaches to tasks in these domains [4-7]. Nevertheless, the integration of LLMs into various facets of society has also heightened scrutiny of their reliability, especially the integrity of their generated content [8]. Despite these impressive feats, the consistency with which LLMs deliver accurate and verifiable information remains a pertinent concern [9, 10]. Instances of models producing misleading information or expressing unwarranted confidence in incorrect outputs underscore the need to ensure the veracity of LLM outputs, notably in critical sectors where precision and factual accuracy are non-negotiable [11]. The phenomenon of "hallucination," wherein models fabricate information, has made improving the truthfulness of LLMs an urgent and pivotal research focus, with substantial implications for future model refinement and application [12].