A Picture Is Worth a Graph: Blueprint Debate on Graph for Multimodal Reasoning
Changmeng Zheng, Dayong Liang, Wengyu Zhang, Xiao-Yong Wei, Tat-Seng Chua, Qing Li
arXiv.org Artificial Intelligence
This paper presents a pilot study that introduces multi-agent debate into multimodal reasoning. The study addresses two key challenges: the trivialization of opinions caused by excessive summarization, and the diversion of focus caused by distractor concepts introduced from images. Both challenges stem from the inductive (bottom-up) nature of existing debating schemes. To address them, we propose a deductive (top-down) debating approach called Blueprint Debate on Graphs (BDoG). In BDoG, debates are confined to a blueprint graph, which prevents the opinion trivialization that arises from world-level summarization. Moreover, by storing evidence in branches within the graph, BDoG mitigates distractions caused by frequent but irrelevant concepts. Extensive experiments validate BDoG, which achieves state-of-the-art results on ScienceQA and MMBench with significant improvements over previous methods.
Mar-22-2024