UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling
–Neural Information Processing Systems
We find that while scaling training data or model size can boost many vision-language model capabilities, scaling offers little benefit for reasoning or relations. Surprisingly, we also discover today's best VLMs struggle on simple digit recognition and counting tasks, e.g. MNIST, which much simpler networks can solve.
Feb-16-2026, 19:08:48 GMT
- Country:
- Europe > Spain
- Andalusia > Granada Province > Granada (0.04)
- North America > United States
- Georgia > Fulton County > Atlanta (0.04)
- Genre:
- Research Report (0.31)
- Industry:
- Health & Medicine > Therapeutic Area (0.46)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning (1.00)
- Natural Language > Large Language Model (0.46)
- Vision (1.00)