ChatBCG: Can AI Read Your Slide Deck?
Nikita Singh, Rob Balian, Lukas Martinelli
arXiv.org Artificial Intelligence
The advanced vision capabilities of GPT-4o and Gemini Flash raise an important question: how accurate are these capabilities in practical business applications? Our assumption was that multimodal models are good at reading and summarizing charts. When given an image of a slide deck, they do a good job of summarizing its key insights, often including relevant data points. Existing research into this question has evaluated the efficacy of LLMs at parsing tables [3], concluding that their performance is highly sensitive to the input prompt. Other work evaluates LLMs' ability to read and reason about mathematical graphs [2] and finds that GPT models outperform alternatives. This paper explores whether multimodal models perform well on a variant of this skill: answering straightforward questions that require the model to pick out a number from a slide deck.
Jul-16-2024
- Country:
- Africa (0.04)
- Asia
- India (0.04)
- Japan (0.04)
- Middle East > Israel (0.04)
- Europe > France (0.04)
- North America > United States (0.04)
- South America > Brazil (0.04)
- Genre:
- Questionnaire & Opinion Survey (0.70)
- Research Report (0.83)