MSG-Chart: Multimodal Scene Graph for ChartQA
Dai, Yue, Han, Soyeon Caren, Liu, Wei
arXiv.org Artificial Intelligence
Automatic Chart Question Answering (ChartQA) is challenging due to the complex distribution of chart elements, with patterns of the underlying data not explicitly displayed in charts. To address this challenge, we design a joint multimodal scene graph for charts to explicitly represent the relationships between chart elements and their patterns. Our proposed multimodal scene graph includes a visual graph and a textual graph to jointly capture the structural and semantic knowledge from the chart. This graph module can be easily integrated with different vision transformers as an inductive bias. Our experiments demonstrate that incorporating the proposed graph module enhances the understanding of the structure and semantics of chart elements, thereby improving performance on the publicly available benchmarks ChartQA and OpenCQA.
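The abstract does not say how the two graphs are built, so the following is only a minimal sketch of what a joint scene graph over chart elements could look like. The `ChartElement` schema, the distance-based rule for visual edges, and the "both nodes carry text" rule for textual edges are all illustrative assumptions, not the authors' construction.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class ChartElement:
    """One detected chart element (hypothetical schema, not from the paper)."""
    name: str        # unique identifier, e.g. "bar_a"
    kind: str        # element type, e.g. "bar", "x_label", "title"
    box: tuple       # (x_min, y_min, x_max, y_max) in pixels
    text: str = ""   # OCR text attached to the element, if any


def center(box):
    """Return the centre point of a bounding box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def build_visual_graph(elements, max_dist):
    """Assumed rule: link elements whose box centres are within max_dist pixels."""
    edges = set()
    for a, b in combinations(elements, 2):
        (ax, ay), (bx, by) = center(a.box), center(b.box)
        if hypot(ax - bx, ay - by) <= max_dist:
            edges.add(frozenset((a.name, b.name)))
    return edges


def build_textual_graph(elements):
    """Assumed rule: link every pair of elements that both carry text."""
    labelled = [e for e in elements if e.text]
    return {frozenset((a.name, b.name)) for a, b in combinations(labelled, 2)}


def build_scene_graph(elements, max_dist=60.0):
    """Combine both edge sets over one shared node set."""
    return {
        "nodes": {e.name: e for e in elements},
        "visual_edges": build_visual_graph(elements, max_dist),
        "textual_edges": build_textual_graph(elements),
    }


# Toy chart: one bar, two axis labels, and a title.
elements = [
    ChartElement("bar_a", "bar", (10, 40, 30, 100)),
    ChartElement("label_a", "x_label", (8, 105, 32, 115), "2019"),
    ChartElement("label_b", "x_label", (58, 105, 82, 115), "2020"),
    ChartElement("title", "title", (0, 0, 400, 10), "Sales"),
]
graph = build_scene_graph(elements, max_dist=60.0)
```

In this toy setup the visual edges capture layout (the bar is linked to the label beneath it), while the textual edges link elements that carry text. A model could consume both edge sets over the shared node set, which is the general idea of a joint visual and textual graph.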
Aug-9-2024
- Country:
  - South America > Colombia
    - Meta Department > Villavicencio (0.04)
  - Oceania
    - Solomon Islands (0.04)
    - Australia
      - Western Australia > Perth (0.14)
      - Victoria > Melbourne (0.04)
  - North America
    - United States
      - District of Columbia > Washington (0.04)
      - New York > New York County
        - New York City (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Idaho > Ada County
        - Boise (0.05)
      - Hawaii > Honolulu County
        - Honolulu (0.04)
    - Canada
      - Ontario > Toronto (0.05)
      - Quebec > Montreal (0.04)
      - British Columbia > Metro Vancouver Regional District
        - Vancouver (0.04)
  - Europe
  - Asia
    - Singapore (0.04)
    - Indonesia > Bali (0.04)
    - Middle East
      - Jordan (0.05)
      - UAE > Abu Dhabi Emirate
        - Abu Dhabi (0.04)
      - Israel > Tel Aviv District
        - Tel Aviv (0.04)
- Genre:
  - Research Report (0.50)
- Technology:
  - Information Technology
    - Data Science > Data Mining
      - Text Mining (0.34)
    - Artificial Intelligence
      - Vision (1.00)
      - Machine Learning > Neural Networks (0.46)
      - Natural Language
        - Question Answering (0.55)
        - Large Language Model (0.48)
        - Text Processing (0.47)