On Subjective Uncertainty Quantification and Calibration in Natural Language Generation
An example of this is question answering (QA): given a question from the user, the model may provide a brief answer, but it may also follow up with supporting facts and explanations, which can vary in form and detail. The user can be satisfied by a wide variety of responses, irrespective of their style or (to some extent) the choice of supporting facts included. Free-form NLG thus poses significant challenges to uncertainty quantification: some aspects of generation are irrelevant to the task's purpose and are best excluded from uncertainty quantification, yet they are often difficult to characterize precisely. If left unaddressed, however, the model's variation along these irrelevant aspects may dominate standard uncertainty measures such as token-level entropy (Kuhn et al., 2023), rendering them uninformative about the model's actual performance on the task.

A recent line of work, starting with Kuhn et al. (2023) and continued by Lin et al. (2024), Zhang et al. (2023), and Aichberger et al. (2024), studied this issue and proposed measuring the "semantic uncertainty" of generation, where the "semantics" of a response is defined as the equivalence class of textual responses that logically entail one another. Empirical improvements on downstream tasks demonstrated the value of these contributions and highlighted the importance of task-specific uncertainty quantification, but important conceptual and practical issues remain. From a practical perspective, semantic equivalence is estimated with machine learning models, resulting in imprecise estimates that do not necessarily define an equivalence relation.
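To make this concrete, the sketch below (not taken from the paper) illustrates one common instantiation of semantic uncertainty in the spirit of Kuhn et al. (2023): sampled responses are grouped by bidirectional entailment and entropy is computed over the resulting clusters rather than over individual sequences. The function names, the greedy single-representative clustering, and the toy `entails` predicate standing in for an NLI model are illustrative assumptions; the comments also note why the learned entailment judgments need not yield a true equivalence relation.

```python
import math
from collections import defaultdict
from typing import Callable, List


def cluster_by_bidirectional_entailment(
    responses: List[str],
    entails: Callable[[str, str], bool],
) -> List[int]:
    """Greedily group responses: a response joins the first cluster whose
    representative it mutually entails; otherwise it opens a new cluster."""
    representatives: List[str] = []
    assignments: List[int] = []
    for response in responses:
        for k, rep in enumerate(representatives):
            # Bidirectional entailment serves as a proxy for semantic
            # equivalence. Because a learned NLI judge need not be
            # symmetric or transitive, the grouping is not guaranteed to
            # be an equivalence relation and can depend on response order.
            if entails(response, rep) and entails(rep, response):
                assignments.append(k)
                break
        else:
            representatives.append(response)
            assignments.append(len(representatives) - 1)
    return assignments


def semantic_entropy(
    responses: List[str],
    log_probs: List[float],
    entails: Callable[[str, str], bool],
) -> float:
    """Entropy over semantic clusters: sequence probabilities of responses
    in the same cluster are pooled, so variation in wording alone does not
    inflate the uncertainty estimate."""
    assignments = cluster_by_bidirectional_entailment(responses, entails)
    cluster_mass = defaultdict(float)
    for k, lp in zip(assignments, log_probs):
        cluster_mass[k] += math.exp(lp)
    total = sum(cluster_mass.values())
    return -sum((m / total) * math.log(m / total) for m in cluster_mass.values())


if __name__ == "__main__":
    # Toy entailment judge (placeholder for an NLI model): exact match
    # after lowercasing and stripping punctuation.
    def toy_entails(premise: str, hypothesis: str) -> bool:
        def normalize(s: str) -> List[str]:
            return "".join(c for c in s.lower() if c.isalnum() or c.isspace()).split()
        return normalize(premise) == normalize(hypothesis)

    samples = ["Paris.", "paris", "Lyon."]
    log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]
    print(semantic_entropy(samples, log_probs, toy_entails))  # ~0.50 nats
```

In this toy run, the first two responses collapse into one semantic cluster with probability mass 0.8, so the cluster-level entropy is much lower than the entropy over the three raw sequences, which is exactly the effect the semantic-uncertainty line of work aims for.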