CITE: Anytime-Valid Statistical Inference in LLM Self-Consistency

Ota, Hirofumi, Iwase, Naoto, Ichihara, Yuki, Komiyama, Junpei, Imaizumi, Masaaki

May-8-2026–arXiv.org Machine Learning

Large language models often improve reasoning by sampling multiple outputs and aggregating their final answers, but precise and efficient control of error levels remains a challenging task. In particular, deciding when to stop sampling remains difficult when the stopping rule is data-dependent and the set of possible response labels is not known in advance. We study anytime-valid certification of a prespecified target answer as the unique mode of the model's response distribution, a guarantee distinct from answer correctness. We propose the Certification by Intersection-union Testing with Eprocesses (CITE) algorithm, which provably controls false certification at any prescribed level under arbitrary data-driven stopping, without requiring prior knowledge of the answer category set. We also prove a category-set-size-free stopping-time rate, establish matching minimax lower bounds up to constants in the main regime, and extend the construction to confidence-weighted voting. Simulations and LLM self-consistency experiments show empirical error control and improved certification in diffuse-tail settings.

category, large language model, natural language, (17 more...)

arXiv.org Machine Learning

May-8-2026

arXiv.org PDF

Add feedback

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine (0.45)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language > Large Language Model (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found