Empower Words: DualGround for Structured Phrase and Sentence-Level Temporal Grounding
Neural Information Processing Systems
Video Temporal Grounding (VTG) aims to localize temporal segments in long, untrimmed videos that align with a given natural language query.
Jun-13-2026, 02:51:50 GMT