Enhancing Fault Detection for Large Language Models via Mutation-Based Confidence Smoothing