Know Your Limits: A Survey of Abstention in Large Language Models

Bingbing Wen, Jihan Yao, Shangbin Feng, Chenjun Xu, Yulia Tsvetkov, Bill Howe, Lucy Lu Wang

arXiv.org Artificial Intelligence 

Large language models (LLMs) have demonstrated generalization capabilities across NLP tasks such as question answering (QA) (Wei et al., 2022; Chowdhery et al., 2022), abstractive summarization (Zhang et al., 2023a), and dialogue generation (Yi et al., 2024). But these models are also unreliable, having a tendency to "hallucinate" false information in their responses (Ji et al., 2023b), generate overly certain or authoritative responses (Zhou et al., 2024b), answer with incomplete information (Zhou et al., 2023b), or produce harmful or dangerous responses (Anwar et al., 2024). In these situations, the model should ideally abstain: to [...]

[...] But questions of human values and the answerability of the query itself are difficult to model in terms of model confidence (Yang et al., 2023).

While prior work demonstrates the potential of abstention in enhancing model safety and reliability (Varshney et al., 2023; Wang et al., 2024c; Zhang et al., 2024a), the study of abstention has also been constrained to specific QA tasks. This task-specific approach limits the broader applicability of abstention strategies across the diverse range of scenarios encountered by general-purpose chatbots engaging in open-domain interactions.
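To make the limitation above concrete, here is a minimal Python sketch of confidence-thresholded abstention, the baseline approach the passage critiques. Everything in it is illustrative and not from the survey: the ScoredAnswer container, the sequence_confidence heuristic (length-normalized likelihood), the 0.5 threshold, and the toy log-probabilities are all assumptions.

```python
import math
from dataclasses import dataclass

@dataclass
class ScoredAnswer:
    text: str
    token_logprobs: list[float]  # per-token log-probabilities from the decoder

def sequence_confidence(ans: ScoredAnswer) -> float:
    """Length-normalized likelihood: exp(mean token log-prob), in (0, 1]."""
    return math.exp(sum(ans.token_logprobs) / len(ans.token_logprobs))

def answer_or_abstain(ans: ScoredAnswer, threshold: float = 0.5) -> str:
    """Abstain when sequence confidence falls below a fixed threshold.

    Caveat raised in the text: this score reflects decoding likelihood
    only; it cannot distinguish an unanswerable or value-sensitive query
    from an ordinary low-confidence one, so the model may still answer
    confidently when it should abstain (and vice versa).
    """
    if sequence_confidence(ans) < threshold:
        return "I am not confident enough to answer this."
    return ans.text

# Toy usage with made-up log-probabilities (hypothetical values):
confident = ScoredAnswer("Paris", [-0.05, -0.10])
uncertain = ScoredAnswer("Maybe 42?", [-2.3, -1.9, -2.7])
print(answer_or_abstain(confident))  # -> "Paris"
print(answer_or_abstain(uncertain))  # -> abstention message
```

The sketch shows why likelihood alone is a weak abstention signal: the threshold acts on how fluently the model decodes an answer, not on whether the question itself is answerable or consistent with human values.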