Training Task Reasoning LLM Agents for Multi-turn Task Planning via Single-turn Reinforcement Learning
