Artemis: Towards Referential Understanding in Complex Videos

Oct-10-2025, 17:08:27 GMT – Neural Information Processing Systems

Videos carry rich visual information, including object descriptions, actions, and interactions, but existing multimodal large language models (MLLMs) fall short ...

Country:
- Asia > China (0.04)
- Europe > Netherlands
  - North Holland > Amsterdam (0.04)

Genre:
- Research Report
  - Experimental Study (0.93)
  - New Finding (0.67)

Industry:
- Information Technology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning (1.00)
  - Representation & Reasoning (0.93)
