Calibrating the Confidence of Large Language Models by Eliciting Fidelity

Zhang, Mozhi, Huang, Mianqiu, Shi, Rundong, Guo, Linsen, Peng, Chong, Yan, Peng, Zhou, Yaqian, Qiu, Xipeng

Apr-3-2024–arXiv.org Artificial Intelligence

Large language models optimized with techniques such as RLHF have become well aligned toward being helpful and harmless. After alignment, however, these models often exhibit overconfidence: the confidence they express is not well calibrated against their actual correctness rate. In this paper, we decompose language model confidence into the Uncertainty about the question and the Fidelity to the answer generated by the language model. We then propose a plug-and-play method for estimating the confidence of language models. In experiments with six RLHF-LMs on four MCQA datasets, our method shows good calibration performance. We also propose two novel metrics, IPR and CE, for evaluating model calibration, and we discuss in detail what constitutes Truly Well-Calibrated Confidence. Our method could serve as a strong baseline, and we hope this work provides some insight into model confidence calibration.

calibration, dataset, language model

Country:
- Asia > Singapore (0.04)
- North America
  - United States
    - New York (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
  - Canada > Ontario
    - Toronto (0.04)
- Europe > Ireland
  - Leinster > County Dublin > Dublin (0.04)

Genre:
- Research Report (0.82)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
