Distributed Multi-Agent Coordination Using Multi-Modal Foundation Models
Mahmud, Saaduddin; Goldfajn, Dorian Benhamou; Zilberstein, Shlomo
arXiv.org Artificial Intelligence
Distributed Constraint Optimization Problems (DCOPs) offer a powerful framework for multi-agent coordination but often rely on labor-intensive, manual problem construction. To address this, we introduce VL-DCOPs, a framework that takes advantage of large multimodal foundation models (LFMs) to automatically generate constraints from both visual and linguistic instructions. We then introduce a spectrum of agent archetypes for solving VL-DCOPs: from a neuro-symbolic agent that delegates some of the algorithmic decisions to an LFM, to a fully neural agent that depends entirely on an LFM for coordination. We evaluate these agent archetypes using state-of-the-art LLMs (large language models) and VLMs (vision language models) on three novel VL-DCOP tasks and compare their respective advantages and drawbacks. Lastly, we discuss how this work extends to broader frontier challenges in the DCOP literature.
Jan-23-2025
- Country:
- North America > United States > Massachusetts (0.14)
- Genre:
- Research Report (0.82)
- Industry:
- Information Technology > Security & Privacy (0.46)
- Technology: