7183145a2a3e0ce2b68cd3735186b1d5-AuthorFeedback.pdf
–Neural Information Processing Systems
The temporal MoCo model thus tries to learn similar embeddings for neighboring12 frames. Intheexemplarsplit,weuse27ofthese16 exemplars totrain linear classifiers ontopofthepre-trained and frozen features. We will do a better job of23 motivating this important question intherevised paper. Wehaverevised the paper to acknowledge and cite this earlier work.(4)44 Mostvideo47 datasets incomputer vision consist ofrelatively short video clips instead (typically on the order oftens ofseconds);48 in this setting, the temporal classification model becomes similar to a video instance embedding model, which has49 beenexploredbefore(cf.
Neural Information Processing Systems
Feb-8-2026, 21:30:04 GMT
- Technology: