A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Neural Information Processing Systems 

Recent developments in vision large language models (LLMs) have brought remarkable progress, yet these models still face challenges on the path toward multimodal generalists, such as coarse instance-level understanding, a lack of unified support for both images and videos, and insufficient coverage of diverse vision tasks.