Credence Calibration Game? Calibrating Large Language Models through Structured Play
