VHELM: A Holistic Evaluation of Vision Language Models
Neural Information Processing Systems
Current benchmarks for assessing vision-language models (VLMs) often focus on their perception or problem-solving capabilities and neglect other critical aspects such as fairness, multilinguality, or toxicity. Furthermore, they differ in their evaluation procedures and the scope of the evaluation, making it difficult to compare models. To address these issues, we extend the HELM framework to VLMs to present the Holistic Evaluation of Vision Language Models (VHELM). VHELM aggregates various datasets to cover one or more of the 9 aspects: visual perception, knowledge, reasoning, bias, fairness, multilinguality, robustness, toxicity, and safety. In doing so, we produce a comprehensive, multi-dimensional view of the capabilities of the VLMs across these important factors.
Dec-27-2025, 14:37:41 GMT