Off-Policy Evaluation for Human Feedback