SynChart: Synthesizing Charts from Language Models
Mengchen Liu, Qixiu Li, Dongdong Chen, Dong Chen, Jianmin Bao, Yunsheng Li
Since the release of GPT-4V(O), using it to generate pseudo labels for multi-modality tasks has become increasingly popular [1]. While we often "stand on the shoulders of giants," the process of building the giant itself, specifically constructing GPT-4V(O) from its foundational large language model (LLM), GPT-4, remains a mystery. In this work, we explore the potential of using LLMs alone to build a competitive multi-modality model. Given budget constraints, we focus on a specific domain, chart understanding, rather than building a general multi-modality model. Since the quantity and quality of data are key determinants of model performance, this work focuses on building a large-scale chart dataset and applying well-established training pipelines. There are two major challenges in constructing such a dataset: first, collecting a diverse set of chart images, and second, the more critical and difficult task of obtaining high-quality labels for these images.
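The abstract does not spell out the synthesis pipeline, but the core idea behind synthetic chart data (the generator owns the underlying table, so exact labels come for free) can be illustrated with a small sketch. The function name synthesize_chart, the bar-chart template, and the QA format below are hypothetical illustrations, not the paper's actual pipeline; in SynChart an LLM would presumably produce the data, the plotting code, and richer annotations.

```python
# Hypothetical sketch of one chart-synthesis step (not the authors' code).
# The underlying table is sampled at random and the pseudo labels
# (question-answer pairs) are derived directly from that table, so they
# are correct by construction.
import json
import random

import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt


def synthesize_chart(path: str, n_bars: int = 5) -> dict:
    """Render one synthetic bar chart and return its pseudo labels."""
    categories = [f"Item {chr(ord('A') + i)}" for i in range(n_bars)]
    values = [round(random.uniform(10, 100), 1) for _ in range(n_bars)]

    # Render the chart image.
    fig, ax = plt.subplots(figsize=(4, 3))
    ax.bar(categories, values)
    ax.set_title("Synthetic sales by item")
    ax.set_ylabel("Sales")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)

    # Labels are exact because we control the underlying data.
    max_idx = values.index(max(values))
    return {
        "image": path,
        "table": dict(zip(categories, values)),
        "qa_pairs": [
            {"q": "Which item has the highest sales?", "a": categories[max_idx]},
            {"q": f"What is the value of {categories[0]}?", "a": values[0]},
        ],
    }


if __name__ == "__main__":
    sample = synthesize_chart("chart_000.png")
    print(json.dumps(sample, indent=2))
```

Looping such a generator over many chart types and styles would address the first challenge (image diversity), while deriving annotations from the known source table addresses the second (label quality) without needing a multi-modality teacher model.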
arXiv.org Artificial Intelligence
Sep-24-2024