Efficient Reinforcement Learning via Large Language Model-based Search