SynChart: Synthesizing Charts from Language Models

Mengchen Liu, Qixiu Li, Dongdong Chen, Dong Chen, Jianmin Bao, Yunsheng Li

arXiv.org Artificial Intelligence 

Since the release of GPT-4V(O), using such models to generate pseudo labels for multi-modality tasks has become increasingly popular [1]. While we often "stand on the shoulders of giants," the process of building the giant itself--specifically, constructing GPT-4V(O) from its foundational large language model (LLM), GPT-4--remains a mystery. In this work, we explore the potential of using LLMs alone to build a competitive multi-modality model. Given budget constraints, we focus on a specific domain--chart understanding--rather than building a general multi-modality model. Since the quantity and quality of data are key determinants of model performance, this work focuses on building a large-scale chart dataset and applying well-established training pipelines. There are two major challenges in constructing such a dataset: first, collecting a diverse set of chart images, and second, the more critical and difficult task of obtaining high-quality labels for these images.
