Carrot and Stick: Eliciting Comparison Data and Beyond

Neural Information Processing Systems 

Comparison data elicited from people are fundamental to many machine learning tasks, including reinforcement learning from human feedback for large language models and the estimation of ranking models. Such data are typically subjective and not directly verifiable.