LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning