Artemis: Towards Referential Understanding in Complex Videos
Neural Information Processing Systems
Videos carry rich visual information, including object descriptions, actions, and interactions, but existing multimodal large language models (MLLMs) fall short
Oct-10-2025, 17:08:27 GMT
- Country:
- Asia > China (0.04)
- Europe > Netherlands
- North Holland > Amsterdam (0.04)
- Genre:
- Research Report
- Experimental Study (0.93)
- New Finding (0.67)
- Industry:
- Information Technology (0.46)
- Technology:
- Information Technology > Artificial Intelligence
- Vision (1.00)
- Natural Language > Large Language Model (1.00)
- Machine Learning (1.00)
- Representation & Reasoning (0.93)