InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
Xiaoyi Dong
Neural Information Processing Systems
The Large Vision-Language Model (LVLM) field has seen significant advancements, yet its progression has been hindered by challenges in comprehending fine-grained visual content due to limited resolution.
Feb-12-2026, 14:50:55 GMT
- Genre:
  - Research Report > Experimental Study (0.93)
- Industry:
  - Education (0.46)