VHELM: A Holistic Evaluation of Vision Language Models

Tony Lee, Haoqin Tu, Chi Heem Wong

Oct-10-2025, 22:31:06 GMT–Neural Information Processing Systems

Our framework is designed to be lightweight and automatic so that evaluation runs are cheap and fast. Our initial run evaluates 22 VLMs on 21 existing datasets to provide a holistic snapshot of the models. We uncover new key findings, such as the fact that efficiency-focused models (e.g., Claude 3 Haiku or Gemini 1.5 Flash) perform significantly

benchmark, claude 3, gemini 1

Country:
- Asia > Japan (0.04)
- South America > Peru
  - Cusco Department > Cusco Province > Cusco (0.04)
- North America
  - Montserrat (0.04)
  - United States
    - North Carolina > Orange County
      - Chapel Hill (0.04)
    - California
      - Santa Clara County > Palo Alto (0.04)
      - Santa Cruz County > Santa Cruz (0.04)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Health & Medicine (1.00)
- Law (0.67)
- Education > Educational Setting (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.55)
