SALT: Step-level Advantage Assignment for Long-horizon Agents via Trajectory Graph
