The Calibration Gap between Model and Human Confidence in Large Language Models