MomentDiff: Generative Video Moment Retrieval from Random to Real
–Neural Information Processing Systems
To achieve this goal, we propose MomentDiff, a generative diffusion-based framework that simulates a typical human retrieval process, from random browsing to gradual localization. Specifically, we first diffuse the real span into random noise, then learn to denoise that noise back to the original span, guided by the similarity between text and video.
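The diffuse-then-denoise idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the (center, width) span encoding, the cosine noise schedule, the deterministic DDIM-style reverse step, and the function names `cosine_alpha_bar`, `diffuse_span`, and `denoise_step` are all assumptions made for illustration. The learned denoiser, which in the described framework would be guided by text-video similarity, is replaced here by a hypothetical oracle that simply returns the true span.

```python
import math
import random


def cosine_alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal level at step t for a cosine schedule (assumed, not from the source)."""
    def f(u):
        return math.cos(((u / num_steps) + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def diffuse_span(span, t, num_steps, rng):
    """Forward process: blend the real span with Gaussian noise at step t."""
    a_bar = cosine_alpha_bar(t, num_steps)
    noise = [rng.gauss(0.0, 1.0) for _ in span]
    noisy = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
             for x, e in zip(span, noise)]
    return noisy, noise


def denoise_step(noisy, predicted_span, t, t_prev, num_steps):
    """One deterministic reverse step (t >= 1), given a prediction of the clean span."""
    a_bar = cosine_alpha_bar(t, num_steps)
    a_bar_prev = cosine_alpha_bar(t_prev, num_steps)
    # Noise implied by the current sample and the predicted clean span.
    eps = [(x_t - math.sqrt(a_bar) * x0) / math.sqrt(1.0 - a_bar)
           for x_t, x0 in zip(noisy, predicted_span)]
    # Re-noise the prediction to the (lower) noise level of the previous step.
    return [math.sqrt(a_bar_prev) * x0 + math.sqrt(1.0 - a_bar_prev) * e
            for x0, e in zip(predicted_span, eps)]


if __name__ == "__main__":
    rng = random.Random(0)
    num_steps = 50
    true_span = [0.4, 0.2]  # hypothetical (center, width), normalized to video length

    # Start from (almost) pure noise, as in "random browsing".
    x, _ = diffuse_span(true_span, num_steps, num_steps, rng)
    for t in range(num_steps, 0, -1):
        # Hypothetical oracle denoiser; a real model would predict the span
        # from the noisy span plus text and video features.
        predicted = true_span
        x = denoise_step(x, predicted, t, t - 1, num_steps)
    print(x)
```

With the oracle standing in for the model, the loop recovers the original span by construction; the sketch only shows how noise is added and removed step by step, not how well a trained denoiser would localize a moment.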
Feb-17-2026, 05:39:39 GMT
- Country:
- Asia > China > Anhui Province > Hefei (0.04)
- Genre:
- Research Report (1.00)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning > Neural Networks (1.00)
- Natural Language (0.69)
- Representation & Reasoning (0.68)
- Vision (1.00)