ChemLabs on ChemO: A Multi-Agent System for Multimodal Reasoning on IChO 2025

Xu, Qiang, Bai, Shengyuan, Chen, Leqing, Liu, Zijing, Li, Yu

Dec-10-2025–arXiv.org Artificial Intelligence

Olympiad-level benchmarks in mathematics and physics are crucial testbeds for advanced AI reasoning, but chemistry, with its unique multimodal symbolic language, has remained an open challenge. W e introduce ChemO, a new benchmark built from the International Chemistry Olympiad (IChO) 2025. ChemO features two key innovations for automated assessment: Assessment-Equivalent Reformulation (AER), which converts problems requiring visual outputs (e.g., drawing molecules) into computationally tractable formats, and Structured Visual Enhancement (SVE), a diagnostic mechanism to disentangle a model's visual perception capabilities from its core chemical reasoning. T o tackle this benchmark, we propose ChemLabs, a hierarchical multi-agent framework that mimics human expert collaboration through specialized agents for problem decomposition, perception, reasoning, and auditing. Experiments on state-of-the-art multimodal models demonstrate that combining SVE with our multi-agent system yields dramatic performance gains. Our top configuration achieves a score of 93.6 out of 100, surpassing an estimated human gold medal threshold and establishing a new state-of-the-art in automated chemical problem-solving.

artificial intelligence, arxiv preprint arxiv, machine learning, (16 more...)

arXiv.org Artificial Intelligence

Dec-10-2025

arXiv.org PDF

Add feedback

Country:
- Asia (0.28)

Genre:
- Research Report (0.82)

Industry:
- Leisure & Entertainment > Sports (0.67)
- Education (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.47)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found