AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents
Xie, Jingxu, Xu, Dylan, Zhao, Xuandong, Song, Dawn
arXiv.org Artificial Intelligence
We introduce AgentSynth, a scalable and cost-efficient pipeline for automatically synthesizing high-quality tasks and trajectory datasets for generalist computer-use agents. Leveraging information asymmetry, AgentSynth constructs subtasks that are simple during generation but significantly more challenging when composed into long-horizon tasks, enabling the creation of over 6,000 diverse and realistic tasks. Our pipeline begins with an LLM-based task proposer guided by a persona, followed by an execution agent that completes the task and logs the trajectory. This process is repeated iteratively to form a sequence of subtasks, which are then summarized by a separate agent into a composite task of controllable difficulty. A key strength of AgentSynth is its ability to precisely modulate task complexity by varying the number of subtasks. Empirical evaluations show that state-of-the-art LLM agents suffer a steep performance drop, from 18% success at difficulty level 1 to just 4% at level 6, highlighting the benchmark's difficulty and discriminative power. Moreover, our pipeline achieves a low average cost of $0.60 per trajectory, orders of magnitude cheaper than human annotations. Our code and data are publicly available at https://github.com/sunblaze-ucb/AgentSynth
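The propose, execute, summarize loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Subtask`, `CompositeTask`, `synthesize_task`, and the three callables are hypothetical, and the stand-in functions replace the LLM-based proposer, execution agent, and summarizer so the sketch runs without any model or desktop environment. Difficulty is taken to be the number of chained subtasks, following the abstract.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Subtask:
    description: str
    trajectory: List[str] = field(default_factory=list)  # logged actions


@dataclass
class CompositeTask:
    description: str
    difficulty: int  # assumed here to equal the number of subtasks
    subtasks: List[Subtask]


def synthesize_task(
    persona: str,
    n_subtasks: int,
    propose: Callable[[str, List[Subtask]], str],
    execute: Callable[[str], List[str]],
    summarize: Callable[[List[Subtask]], str],
) -> CompositeTask:
    """Chain n_subtasks simple subtasks, then summarize them into one task."""
    if n_subtasks < 1:
        raise ValueError("n_subtasks must be at least 1")
    history: List[Subtask] = []
    for _ in range(n_subtasks):
        # The proposer sees the persona and everything done so far.
        description = propose(persona, history)
        # The executor completes the subtask and returns its trajectory.
        trajectory = execute(description)
        history.append(Subtask(description, trajectory))
    # A separate summarizer turns the subtask sequence into a composite task.
    return CompositeTask(summarize(history), len(history), history)


# Hypothetical stand-ins for the LLM-backed components.
def fake_propose(persona: str, history: List[Subtask]) -> str:
    return f"step {len(history) + 1} for {persona}"


def fake_execute(description: str) -> List[str]:
    return [f"action for: {description}"]


def fake_summarize(history: List[Subtask]) -> str:
    return "; then ".join(s.description for s in history)


task = synthesize_task(
    "a graduate student", 3, fake_propose, fake_execute, fake_summarize
)
print(task.difficulty)
print(task.description)
```

Passing the three components as plain callables keeps the loop independent of any particular model or environment, and changing `n_subtasks` is the only knob needed to vary the difficulty level in this sketch.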
Jun-18-2025
- Country:
- North America > United States > California > Alameda County > Berkeley (0.04)
- North America > United States > Kentucky (0.05)
- Genre:
- Research Report (1.00)
- Workflow (1.00)
- Industry:
- Energy (0.96)
- Information Technology > Security & Privacy (0.93)
- Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Communications (1.00)
- Information Technology > Security & Privacy (0.93)
- Information Technology > Software (1.00)