Appendix of Prophet Attention
–Neural Information Processing Systems
... CIDEr-c40, which is the default ranking score on the leaderboard, and rank 1st. Compared with image captioning, the target of video captioning is the video clip, i.e., an ordered ... The dataset contains 10,000 video clips, and each video is paired with 20 annotated sentences. We use the official splits to report our results. CIDEr, which is built on n-gram matching, is used in our tests for performance evaluation. All re-implementations and our experiments were run on V100 GPUs.
Oct-2-2025, 04:30:34 GMT
- Country:
- Asia > China
- Beijing > Beijing (0.04)
- Guangdong Province > Shenzhen (0.05)
- North America > Canada (0.04)
- Genre:
- Research Report > New Finding (0.35)
- Technology: