RRO: LLM Agent Optimization Through Rising Reward Trajectories
