Uncertainty Quantification and Decomposition for LLM-based Recommendation

Wonbin Kweon, Sanghwan Jang, SeongKu Kang, Hwanjo Yu

arXiv.org Artificial Intelligence 

Despite their widespread adoption for recommendation, we demonstrate that LLMs often exhibit uncertainty in their recommendations. To ensure the trustworthy use of LLMs in generating recommendations, we emphasize the importance of assessing the reliability of recommendations generated by LLMs. We start by introducing a novel framework for estimating the predictive uncertainty to quantitatively measure the reliability of LLM-based recommendations. We further propose to decompose the predictive uncertainty into recommendation uncertainty and prompt uncertainty, enabling in-depth analyses of the primary source of uncertainty. Through extensive experiments, we (1) demonstrate that predictive uncertainty effectively indicates the reliability of LLM-based recommendations, (2) investigate the origins of uncertainty with decomposed uncertainty measures, and (3) propose uncertainty-aware prompting for lower predictive uncertainty and enhanced recommendation. Our source code and model weights are available at https://github.com/WonbinKweon/

Instruction-tuned LLMs [4, 29, 64, 66] have shown remarkable performance on the zero-shot ranking task [23, 25], and can be further fine-tuned with the user history logged on the system [2, 19, 81]. Recent methods [10, 70, 79, 80] adopt the retrieval-augmented generation paradigm [3, 27], where LLMs are employed to generate ranking lists with candidates retrieved by candidate generators. This approach exhibits state-of-the-art recommendation performance over conventional sequential recommenders [31, 63], facilitating better online updates and avoiding hallucination. While LLMs have been widely employed in real-world applications that can influence human behavior, there is a lack of exploration in assessing the reliability of LLM-based recommendation. Indeed, despite their superior performance, we demonstrate that recommendations generated by LLMs are highly volatile depending on the prompting details (e.g., word choice, number of user histories,
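The decomposition of predictive uncertainty into recommendation and prompt uncertainty can be illustrated with a small Monte Carlo sketch. This is not the paper's exact estimator; it assumes a standard entropy-based split via the law of total entropy, and the function names and sampling setup (`decompose_uncertainty`, top-1 items sampled per prompt variant) are hypothetical. Under this assumption, predictive uncertainty is the entropy of the item distribution pooled over prompt variants, recommendation uncertainty is the average per-prompt entropy, and prompt uncertainty is the remainder (the mutual information between the prompt variant and the recommended item).

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (in nats) of the empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values() if c > 0)

def decompose_uncertainty(samples_per_prompt):
    """Entropy-based uncertainty decomposition (illustrative sketch).

    samples_per_prompt: list over prompt variants; each entry is a list of
    top-1 items sampled from the LLM under that prompt (equal sample counts).
    Returns (predictive, recommendation, prompt) uncertainties, with
    predictive = recommendation + prompt by the law of total entropy.
    """
    # Pool samples from all prompt variants -> predictive (total) uncertainty.
    marginal = Counter(item for samples in samples_per_prompt for item in samples)
    predictive = entropy(marginal)
    # Average per-prompt entropy -> recommendation uncertainty.
    recommendation = sum(entropy(Counter(s)) for s in samples_per_prompt) / len(samples_per_prompt)
    # Remainder (mutual information between prompt and item) -> prompt uncertainty.
    prompt = predictive - recommendation
    return predictive, recommendation, prompt
```

For example, if two prompt variants yield samples `[["A","A","B","A"], ["B","B","B","A"]]`, the pooled distribution is uniform over A and B (predictive uncertainty ln 2), and the gap between that and the average per-prompt entropy is attributed to the prompt.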