Conformal Prediction with Large Language Models for Multi-Choice Question Answering

Bhawesh Kumar, Charlie Lu, Gauri Gupta, Anil Palepu, David Bellamy, Ramesh Raskar, Andrew Beam

Jul-7-2023–arXiv.org Artificial Intelligence

As large language models continue to be widely developed, robust uncertainty quantification techniques will become crucial for their safe deployment in high-stakes scenarios. In this work, we explore how conformal prediction can be used to provide uncertainty quantification in language models for the specific task of multiple-choice question answering. We find that the uncertainty estimates from conformal prediction are tightly correlated with prediction accuracy. This observation can be useful for downstream applications such as selective classification and filtering out low-quality predictions. We also investigate whether the exchangeability assumption required by conformal prediction holds for out-of-subject questions, which may be a more realistic scenario for many practical applications. Our work contributes towards more trustworthy and reliable usage of large language models in safety-critical situations, where robust guarantees on the error rate are required.
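To make the idea concrete, here is a minimal sketch of split conformal prediction applied to a multiple-choice question. It is an illustration under stated assumptions, not necessarily the procedure used in the paper: it assumes the model's output has already been turned into a probability for each answer choice, it uses one common nonconformity score (one minus the probability of the correct choice), and the function names `conformal_threshold` and `prediction_set` are hypothetical.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Compute the score threshold from a held-out calibration set.

    cal_probs:  list of per-question probability lists (one entry per choice)
    cal_labels: list of correct-choice indices
    alpha:      target error rate, e.g. 0.1 for roughly 90% coverage
    """
    # Nonconformity score (an assumption of this sketch):
    # 1 - probability assigned to the correct choice.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration questions for this alpha:
        # fall back to including every choice.
        return float("inf")
    return scores[k - 1]


def prediction_set(probs, qhat):
    """Return the indices of all choices whose score is within the threshold."""
    return [i for i, p in enumerate(probs) if 1.0 - p <= qhat]


# Toy calibration data: three questions with four choices each.
cal_probs = [
    [0.5, 0.25, 0.125, 0.125],
    [0.25, 0.5, 0.125, 0.125],
    [0.125, 0.125, 0.5, 0.25],
]
cal_labels = [0, 0, 3]

qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.5)
answer_set = prediction_set([0.5, 0.25, 0.125, 0.125], qhat)
```

In a sketch like this, the size of the returned set serves as the uncertainty estimate: a single-choice set signals a confident prediction, while a large set can be used to flag a question for selective classification or filtering. The coverage guarantee relies on the calibration and test questions being exchangeable, which is the assumption the abstract examines for out-of-subject questions.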

large language model, machine learning, natural language

Country:
- North America > United States
  - Oklahoma > Payne County
    - Cushing (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
  - Hawaii > Honolulu County
    - Honolulu (0.04)
- Asia
  - Singapore (0.04)
  - Middle East
    - Jordan (0.04)
    - UAE > Abu Dhabi Emirate
      - Abu Dhabi (0.04)

Genre:
- Research Report (0.82)

Industry:
- Health & Medicine > Therapeutic Area
  - Infections and Infectious Diseases (0.46)
- Education > Curriculum
  - Subject-Specific Education (0.54)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.30)
