Variational Best-of-N Alignment