Catch Me if You Search: When Contextual Web Search Results Affect the Detection of Hallucinations
Mahjabin Nahar, Eun-Ju Lee, Jin Won Park, Dongwon Lee
arXiv.org Artificial Intelligence
While we increasingly rely on large language models (LLMs) for various tasks, these models are known to produce inaccurate content or 'hallucinations' with potentially disastrous consequences. The recent integration of web search results into LLMs prompts the question of whether people utilize them to verify the generated content, thereby accurately detecting hallucinations. An online experiment (N=560) investigated how the provision of search results, either static (i.e., fixed search results provided by LLM) or dynamic (i.e., participant-led searches), affects participants' perceived accuracy of LLM-generated content (i.e., genuine, minor hallucination, major hallucination), self-confidence in accuracy ratings, as well as their overall evaluation of the LLM, as compared to the control condition (i.e., no search results). Results showed that participants in both static and dynamic conditions (vs. control) rated hallucinated content to be less accurate and perceived the LLM more negatively. However, those in the dynamic condition rated genuine content as more accurate and demonstrated greater overall self-confidence in their assessments than those in the static search or control conditions. We highlighted practical implications of incorporating web search functionality into LLMs in real-world contexts.
Sep-18-2025