Judging with Confidence: Calibrating Autoraters to Preference Distributions
