Investigating Context-Faithfulness in Large Language Models: The Roles of Memory Strength and Evidence Style
Yuepei Li, Kang Zhou, Qiao Qiao, Bach Nguyen, Qing Wang, Qi Li
arXiv.org Artificial Intelligence
Retrieval-augmented generation (RAG) improves large language models (LLMs) by incorporating external information into the response-generation process. However, how context-faithful LLMs are, and what factors influence their context-faithfulness, remain largely unexplored. In this study, we investigate the impact of memory strength and evidence presentation on LLMs' receptiveness to external evidence. We introduce a method to quantify an LLM's memory strength by measuring the divergence among its responses to different paraphrases of the same question, a factor not considered in previous work. We also generate evidence in various styles to evaluate how evidence style affects receptiveness. Two datasets are used for evaluation: Natural Questions (NQ), featuring popular questions, and PopQA, featuring long-tail questions. Our results show that for questions with high memory strength, LLMs are more likely to rely on internal memory, particularly larger LLMs such as GPT-4. Conversely, presenting paraphrased evidence significantly increases LLMs' receptiveness compared to simple repetition or adding details.
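The core idea of the memory-strength measure can be sketched as follows. This is a simplified illustration, not the paper's exact metric: it treats memory strength as the pairwise agreement rate among a model's (normalized) answers to paraphrases of the same question, so that high agreement (low divergence) indicates strong internal memory. The `memory_strength` function and its inputs are hypothetical.

```python
from itertools import combinations

def memory_strength(responses):
    """Estimate memory strength as the pairwise agreement rate among
    an LLM's answers to paraphrases of the same question.

    High agreement (low divergence across paraphrases) suggests the
    model holds a strong internal memory of the answer; frequent
    disagreement suggests a weak one.
    """
    # Light normalization so trivial formatting differences don't count
    # as divergence.
    normalized = [r.strip().lower() for r in responses]
    pairs = list(combinations(normalized, 2))
    if not pairs:  # a single response gives no divergence signal
        return 1.0
    agree = sum(a == b for a, b in pairs)
    return agree / len(pairs)

# Answers to three paraphrases of the same question: two of the three
# pairs disagree, so estimated memory strength is 1/3.
print(memory_strength(["Shakespeare", "shakespeare", "Christopher Marlowe"]))
```

In practice one would compare full generated answers (e.g. with exact match after answer extraction, or a semantic-similarity score) rather than raw strings, but the agreement-over-paraphrases structure is the same.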
Sep-17-2024