I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token
– Neural Information Processing Systems
Large Language Models are known to capture real-world knowledge, allowing them to excel in many downstream tasks. Despite recent advances, these models are still prone to what are commonly known as hallucinations, causing them to emit unwanted and factually incorrect text. In this work, we propose a novel calibration method that can be used to combat hallucinations. We add a special [IDK] ("I Don't Know") token to the model's vocabulary and introduce an objective function that shifts probability mass to the [IDK] token for incorrect predictions. This approach allows the model to express uncertainty in its output explicitly.
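The abstract only outlines the objective, so the following is a minimal sketch of the general idea rather than the paper's actual method: it assumes a PyTorch setup, a confidence-based weighting scheme, and an `idk_loss` helper that are not specified in the original. The loss builds a soft target that keeps most probability on the gold token when the model is confident and shifts mass toward the [IDK] token when it is not.

```python
import torch
import torch.nn.functional as F

# In a Hugging Face setup, registering the new token would look roughly like:
#   tokenizer.add_special_tokens({"additional_special_tokens": ["[IDK]"]})
#   model.resize_token_embeddings(len(tokenizer))
#   idk_token_id = tokenizer.convert_tokens_to_ids("[IDK]")

def idk_loss(logits: torch.Tensor, targets: torch.Tensor, idk_token_id: int) -> torch.Tensor:
    """Cross-entropy against a soft target that puts (1 - w) of the mass on the
    gold token and w on [IDK], where w grows as the model's confidence in the
    gold token drops (an assumed weighting scheme, not the paper's exact one)."""
    probs = F.softmax(logits, dim=-1)
    # Confidence assigned to the gold token; low confidence -> likely incorrect
    # prediction -> more mass shifted to [IDK].
    gold_prob = probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # (batch,)
    idk_weight = (1.0 - gold_prob).detach()

    # Soft target distribution: (1 - w) on the gold token, w on [IDK].
    soft_targets = torch.zeros_like(probs)
    soft_targets.scatter_(-1, targets.unsqueeze(-1), (1.0 - idk_weight).unsqueeze(-1))
    soft_targets[:, idk_token_id] += idk_weight

    log_probs = F.log_softmax(logits, dim=-1)
    return -(soft_targets * log_probs).sum(dim=-1).mean()

if __name__ == "__main__":
    vocab_size, idk_token_id = 101, 100  # [IDK] appended as the last vocabulary entry
    logits = torch.randn(4, vocab_size, requires_grad=True)
    targets = torch.randint(0, 100, (4,))
    loss = idk_loss(logits, targets, idk_token_id)
    loss.backward()
    print(loss.item())
```

At inference time, a high probability on [IDK] can then be read as the model explicitly declining to answer rather than emitting a likely-incorrect continuation.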