Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework
Dong Wang, Yang Li, Ansong Ni, Ching-Feng Yeh, Youssef Emad, Xinjie Lei, Liam Robbins, Karthik Padthe, Hu Xu, Xian Li, Asli Celikyilmaz, Ramya Raghavendra, Lifei Huang, Carole-Jean Wu, Shang-Wen Li
arXiv.org Artificial Intelligence
Synthetic data has become increasingly important for training large language models, especially when real data is scarce, expensive, or privacy-sensitive. Many such generation tasks require coordinated multi-agent workflows, where specialized agents collaborate to produce data that is higher quality, more diverse, and structurally richer. However, existing frameworks for multi-agent synthesis often depend on a centralized orchestrator, creating scalability bottlenecks, or are hardcoded for specific domains, limiting flexibility. We present Matrix, a decentralized framework that represents both control and data flow as serialized messages passed through distributed queues. This peer-to-peer design eliminates the central orchestrator. Each task progresses independently through lightweight agents, while compute-intensive operations, such as LLM inference or containerized environments, are handled by distributed services. Built on Ray, Matrix scales to tens of thousands of concurrent agentic workflows and provides a modular, configurable design that enables easy adaptation to a wide range of data generation workflows. We evaluate Matrix across diverse synthesis scenarios, such as multi-agent collaborative dialogue, web-based reasoning data extraction, and tool-use trajectory generation in customer service environments. In all cases, Matrix achieves 2 to 15 times higher data generation throughput under identical hardware resources, without compromising output quality.
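The peer-to-peer idea in the abstract, where each task carries its own control state and data in a serialized message that moves from queue to queue, can be illustrated with a minimal sketch. This is not Matrix's API: it uses in-process `asyncio` queues instead of distributed queues on Ray, and the agent names (`writer`, `critic`, `sink`), the `ROUTE` table, and the message fields are all hypothetical, chosen only for illustration.

```python
import asyncio
import json

# Hypothetical routing table: which agent a message visits after each step.
# None of these names come from Matrix.
ROUTE = {"writer": "critic", "critic": "sink"}


async def agent(name, queues, handler):
    """Consume serialized tasks from this agent's own queue and forward them."""
    while True:
        raw = await queues[name].get()
        if raw is None:  # shutdown signal
            break
        task = json.loads(raw)  # control state and data travel together
        task["history"].append(handler(task))
        task["next"] = ROUTE[name]  # the message records its own next hop
        await queues[task["next"]].put(json.dumps(task))


async def run(num_tasks=3):
    """Push tasks through writer -> critic -> sink with no central orchestrator."""
    queues = {n: asyncio.Queue() for n in ("writer", "critic", "sink")}
    workers = [
        asyncio.create_task(
            agent("writer", queues, lambda t: f"draft for {t['id']}")
        ),
        asyncio.create_task(
            agent("critic", queues, lambda t: f"review of {t['history'][-1]}")
        ),
    ]
    # Seed the first queue; after this, each task advances independently.
    for i in range(num_tasks):
        msg = {"id": i, "next": "writer", "history": []}
        await queues["writer"].put(json.dumps(msg))
    # Collect finished tasks from the sink queue.
    results = [json.loads(await queues["sink"].get()) for _ in range(num_tasks)]
    # Stop the agents and wait for them to exit.
    for name in ("writer", "critic"):
        await queues[name].put(None)
    await asyncio.gather(*workers)
    return sorted(results, key=lambda t: t["id"])


if __name__ == "__main__":
    for task in asyncio.run(run()):
        print(task["id"], task["history"])
```

In this sketch no component tracks the global state of a workflow: an agent only reads its own queue, and everything needed to continue a task is inside the message. A distributed version would presumably replace the in-process queues with networked ones and send heavy steps such as LLM inference to separate services, as the abstract describes.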
Nov-27-2025
- Genre:
- Research Report (0.66)
- Workflow (0.76)
- Technology: