Planner-R1: Reward Shaping Enables Efficient Agentic RL with Smaller LLMs