InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
Xiaoyi Dong

Feb-12-2026, 14:50:55 GMT–Neural Information Processing Systems

The Large Vision-Language Model (LVLM) field has advanced significantly, yet its progress has been hindered by difficulty comprehending fine-grained visual content at limited resolution.

large language model, machine learning, resolution

Country:
- Europe
  - Netherlands > North Holland
    - Amsterdam (0.04)
  - France > Bourgogne-Franche-Comté
    - Doubs > Besançon (0.04)
- Asia > China
  - Shanghai > Shanghai (0.04)
  - Hong Kong (0.04)

Genre:
- Research Report > Experimental Study (0.93)

Industry:
- Education (0.46)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Vision (1.00)
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (0.93)
    - Machine Learning > Neural Networks
      - Deep Learning (0.46)
