MomentDiff: Generative Video Moment Retrieval from Random to Real (Supplementary Material)
–Neural Information Processing Systems
Each video is annotated with an average of 2.4 moments, with [...] The dataset contains a total of 10,310 queries with 18,367 annotated moments. [...] Then, we design the dataset Charades-STA-Mom based on the span's end time [...] Algorithm 1 provides the pseudo-code of MomentDiff training in a PyTorch-like style. [...] Inference efficiency is critical for machine learning models. [...] We report R1@0.5, R1@0.7 and MAP [...] Figure 1 shows the performance fluctuation of the model on the Charades-STA dataset. [...] GloVe; SF+C, C;) to organize experiments. [...] Therefore we adopt DDIM as the default technology.
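For readers unfamiliar with the reported R1@0.5 and R1@0.7 metrics, the following is a minimal sketch of how recall at rank 1 under a temporal IoU threshold is commonly computed in moment retrieval. It is not taken from the paper: the function names `temporal_iou` and `recall_at_1` are hypothetical, and the sketch assumes exactly one ground-truth span and one top-ranked predicted span per query, each given as (start, end) in seconds.

```python
from typing import Sequence, Tuple

# A temporal span as (start_time, end_time) in seconds.
# Assumption for this sketch: end_time >= start_time.
Span = Tuple[float, float]


def temporal_iou(pred: Span, gt: Span) -> float:
    """Intersection over union of two 1-D temporal spans."""
    intersection = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - intersection
    return intersection / union if union > 0 else 0.0


def recall_at_1(
    top1_preds: Sequence[Span],
    gts: Sequence[Span],
    iou_threshold: float,
) -> float:
    """Fraction of queries whose top-1 prediction reaches the IoU threshold.

    Hypothetical helper: assumes one ground-truth span per query.
    """
    if len(top1_preds) != len(gts):
        raise ValueError("need exactly one top-1 prediction per ground-truth span")
    if not gts:
        return 0.0
    hits = sum(
        temporal_iou(pred, gt) >= iou_threshold
        for pred, gt in zip(top1_preds, gts)
    )
    return hits / len(gts)


if __name__ == "__main__":
    # Illustrative spans only, not results from the paper.
    predictions = [(0.0, 10.0), (5.0, 15.0), (20.0, 30.0)]
    ground_truth = [(0.0, 10.0), (10.0, 20.0), (24.0, 30.0)]
    for threshold in (0.5, 0.7):
        score = recall_at_1(predictions, ground_truth, threshold)
        print(f"R1@{threshold}: {score:.3f}")
```

Raising the threshold from 0.5 to 0.7 only tightens the overlap required for a hit, so R1@0.7 can never exceed R1@0.5 on the same predictions.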
Oct-9-2025, 08:02:47 GMT