From Roots to Rewards: Dynamic Tree Reasoning with Reinforcement Learning