Bayesian Calibration of Win Rate Estimation with LLM Evaluators
