Evaluation of Best-of-N Sampling Strategies for Language Model Alignment