VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding

Houlun Chen

Feb-12-2026, 07:33:12 GMT–Neural Information Processing Systems

Specifically, we use large language models (LLMs) and large multimodal models (LMMs) with our proposed Statics and Dynamics Enhanced Captioning modules to generate diverse, fine-grained captions for each video.

large language model, machine learning, natural language

Country:
- Asia > China > Beijing > Beijing (0.05)

Genre:
- Research Report (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning (1.00)
