Captioning Visualizations with Large Language Models (CVLLM): A Tutorial
Carenini, Giuseppe; Johnson, Jordon; Salamatian, Ali
arXiv.org Artificial Intelligence
It is well established that visualizations have advantages over text-based representations for a number of analysis tasks, since they more fully leverage our innate visual processing capabilities. However, visualizations have also been found to benefit from textual augmentations such as captions [1]. Meanwhile, recent advances in large language models (LLMs) have led to their incorporation into an unprecedented number of applications and domains. This tutorial therefore aims to provide: (1) an overview of captioning visualizations and key concepts in Information Visualization (InfoVis), (2) an introduction to neural networks and transformers, (3) an exploration of the limitations of LLMs and recent developments in the field, and (4) the latest research on InfoVis captioning using LLMs and Large Vision-Language Models (LVLMs). We will begin with an overview of key concepts in InfoVis and captioning visualizations, including marks, channels, and content characterization.
Jun-27-2024
- Country:
- Asia > Singapore (0.05)
- North America
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy > Tuscany > Florence (0.04)