Mitigating Reward Overoptimization via Lightweight Uncertainty Estimation
