Uncertainty Quantification for Large Language Model Reward Learning under Heterogeneous Human Feedback
