Enhancing Test-Time Scaling of Large Language Models with Hierarchical Retrieval-Augmented MCTS