Detecting Moments and Highlights in Videos via Natural Language Queries
Neural Information Processing Systems
Each video in the dataset is annotated with: (1) a human-written, free-form natural language (NL) query, (2) the moments in the video that are relevant to the query, and (3) saliency scores on a five-point scale for all query-relevant clips.
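The three annotation components can be pictured as a small record type. The following is a minimal sketch only: the class and field names (`VideoAnnotation`, `ClipSaliency`, `query`, `moments`, `saliency`), the use of seconds for moment boundaries, and the 1-to-5 integer encoding of the five-point scale are all assumptions made for illustration, not the dataset's actual file format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ClipSaliency:
    """Saliency score for one query-relevant clip (hypothetical layout)."""
    clip_index: int
    score: int  # assumed encoding of the five-point scale: 1 (lowest) to 5 (highest)

    def __post_init__(self) -> None:
        if not 1 <= self.score <= 5:
            raise ValueError(f"score must be in 1..5, got {self.score}")


@dataclass
class VideoAnnotation:
    """One annotated video: query, relevant moments, per-clip saliency."""
    query: str                          # (1) free-form natural language query
    moments: List[Tuple[float, float]]  # (2) relevant (start, end) spans, assumed in seconds
    saliency: List[ClipSaliency]        # (3) scores for query-relevant clips

    def __post_init__(self) -> None:
        for start, end in self.moments:
            if not 0 <= start < end:
                raise ValueError(f"invalid moment span: ({start}, {end})")

    def highlight_clips(self, min_score: int) -> List[int]:
        """Return indices of clips whose saliency is at least min_score."""
        return [c.clip_index for c in self.saliency if c.score >= min_score]


# Made-up example values, not taken from the dataset.
example = VideoAnnotation(
    query="A person is cooking pasta in the kitchen",
    moments=[(10.0, 24.0), (40.0, 52.0)],
    saliency=[ClipSaliency(5, 3), ClipSaliency(6, 5), ClipSaliency(20, 4)],
)
top_clips = example.highlight_clips(min_score=4)
```

Keeping moments and saliency scores as separate fields mirrors the description above: moments mark where the query is relevant, while saliency scores rank the clips inside those moments.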
Aug-14-2025, 20:18:40 GMT