ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs' Capability via Chart Editing
Xuanle Zhao, Xuexin Liu, Haoyue Yang, Xianzhen Luo, Fanhu Zeng, Jianling Li, Qi Shi, Chi Chen
arXiv.org Artificial Intelligence
Although multimodal large language models (MLLMs) show promise in generating chart-rendering code, editing charts via code presents a greater challenge. This task demands that MLLMs integrate chart understanding and reasoning capacities, which makes it particularly demanding. While many MLLMs claim such editing capabilities, current evaluations rely on limited case studies, highlighting the urgent need for a comprehensive evaluation framework. In this work, we propose ChartEdit, a novel benchmark designed for chart-editing tasks, featuring 1,405 diverse editing instructions applied to 233 real-world charts, each manually annotated and validated for accuracy. Using ChartEdit, we evaluate the performance of 10 mainstream MLLMs across two types of experiments, at both the code and chart levels. The results suggest that large-scale models can generate code that produces images partially matching the reference images, but their ability to make accurate edits according to the instructions remains limited: the state-of-the-art (SOTA) model achieves a score of only 59.96, highlighting significant challenges in precise modification. In contrast, small-scale models, including chart-domain models, struggle both to follow editing instructions and to generate the overall chart image, underscoring the need for further development in this area. Code is available at https://github.com/xxlllz/ChartEdit.
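To make the task concrete, below is a minimal sketch of what a chart-editing instance might look like, assuming a matplotlib-based pipeline. The chart code, the editing instruction, and the crude pixel-level comparison at the end are all hypothetical illustrations; they are not drawn from the ChartEdit benchmark or its actual evaluation metrics.

# Hypothetical sketch of the chart-editing-via-code task; not from ChartEdit.
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

def original_chart():
    """Reference chart: the source code an MLLM receives as input."""
    fig, ax = plt.subplots()
    ax.bar(["Q1", "Q2", "Q3", "Q4"], [12, 17, 9, 14], color="steelblue")
    ax.set_title("Quarterly Sales")
    return fig

# Editing instruction (hypothetical): "Change the bars to orange, add the
# y-axis label 'Revenue (M$)', and rotate the x-tick labels by 45 degrees."
def edited_chart():
    """Edited chart: the code an MLLM should produce for the instruction."""
    fig, ax = plt.subplots()
    ax.bar(["Q1", "Q2", "Q3", "Q4"], [12, 17, 9, 14], color="orange")
    ax.set_title("Quarterly Sales")
    ax.set_ylabel("Revenue (M$)")
    ax.tick_params(axis="x", labelrotation=45)
    return fig

original_chart().savefig("reference.png")
edited_chart().savefig("edited.png")

# A crude stand-in for chart-level evaluation: compare the rendered images
# pixel-wise (both use matplotlib's default figure size, so shapes match).
ref = np.asarray(Image.open("reference.png").convert("RGB"), dtype=float)
out = np.asarray(Image.open("edited.png").convert("RGB"), dtype=float)
print("mean absolute pixel difference:", np.abs(ref - out).mean())

In the benchmark itself, per the abstract, evaluation is carried out at both the code level and the chart level; the pixel difference above is only an illustrative stand-in for the latter.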
Aug-5-2025