PILAF: Optimal Human Preference Sampling for Reward Modeling