Bridging Human and LLM Judgments: Understanding and Narrowing the Gap
Neural Information Processing Systems
Large language models are increasingly used as judges (LLM-as-a-judge) to evaluate model outputs at scale, but their assessments often diverge systematically from human judgments.
Jun-15-2026, 02:32:37 GMT