Benchmarking Long-context Document Understanding with Visualizations
Neural Information Processing Systems
Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their ability to handle long-context DU remains an open question.
Mar-26-2025, 22:24:18 GMT
- Country:
  - Asia > Middle East > UAE (0.14)
  - Europe (0.67)
  - North America > United States > Illinois (0.14)
- Genre:
  - Research Report (1.00)
- Industry:
  - Education > Educational Setting (0.45)
  - Health & Medicine (0.67)
  - Information Technology (0.93)
  - Law (0.68)
  - Leisure & Entertainment (0.93)
- Technology:
  - Information Technology
    - Artificial Intelligence
      - Machine Learning > Neural Networks > Deep Learning (0.72)
      - Natural Language
        - Chatbot (0.72)
        - Large Language Model (1.00)
      - Vision (0.88)
    - Communications > Social Media (0.92)