MSG-Chart: Multimodal Scene Graph for ChartQA

Aug-9-2024–arXiv.org Artificial Intelligence

Automatic Chart Question Answering (ChartQA) is challenging due to the complex distribution of chart elements with patterns of the underlying data not explicitly displayed in charts. To address this challenge, we design a joint multimodal scene graph for charts to explicitly represent the relationships between chart elements and their patterns. Our proposed multimodal scene graph includes a visual graph and a textual graph to jointly capture the structural and semantic knowledge from the chart. This graph module can be easily integrated with different vision transformers as inductive bias. Our experiments demonstrate that incorporating the proposed graph module enhances the understanding of chart elements' structure and semantics, thereby improving performance on the publicly available benchmarks ChartQA and OpenCQA.

Figure 1: Cutting-Edge LLMs and Our MSG-Chart

... charts often include extensive text and numerical data, understanding these features is crucial for accurate question answering. While recognizing the underlying text of an object is enough for data extraction ...

chartqa, computational linguistic, graph

Country:
- South America > Colombia
  - Meta Department > Villavicencio (0.04)
- Oceania
  - Solomon Islands (0.04)
  - Australia
    - Western Australia > Perth (0.14)
    - Victoria > Melbourne (0.04)
- North America
  - United States
    - District of Columbia > Washington (0.04)
    - New York > New York County
      - New York City (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Idaho > Ada County
      - Boise (0.05)
    - Hawaii > Honolulu County
      - Honolulu (0.04)
  - Canada
    - Ontario > Toronto (0.05)
    - Quebec > Montreal (0.04)
    - British Columbia > Metro Vancouver Regional District
      - Vancouver (0.04)
- Europe
  - Italy (0.04)
  - Greece (0.04)
  - France (0.04)
  - Belgium (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
- Asia
  - Singapore (0.04)
  - Indonesia > Bali (0.04)
  - Middle East
    - Jordan (0.05)
    - UAE > Abu Dhabi Emirate
      - Abu Dhabi (0.04)
    - Israel > Tel Aviv District
      - Tel Aviv (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Text Mining (0.34)
  - Artificial Intelligence
    - Vision (1.00)
    - Machine Learning > Neural Networks (0.46)
    - Natural Language
      - Question Answering (0.55)
      - Large Language Model (0.48)
      - Text Processing (0.47)
