Beyond Outcome Reward: Decoupling Search and Answering Improves LLM Agents
