CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges