TAP-Vid: A Benchmark for Tracking Any Point in a Video
–Neural Information Processing Systems
Central to the construction of our benchmark is a novel semi-automatic crowdsourced pipeline which uses optical flow estimates to compensate for easier, short-term motion like camera shake, allowing annotators to focus on harder sections of video.
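
To make the idea of flow-assisted annotation concrete, here is a minimal sketch of carrying one annotated point forward through a few frames by chaining dense optical flow. It is not the paper's pipeline: the function name `propagate_point`, the flow layout (an array of shape height x width x 2 holding per-pixel `(dx, dy)` displacements to the next frame), and the nearest-pixel lookup are all assumptions made for illustration.

```python
import numpy as np

def propagate_point(start_xy, flows):
    """Carry a point forward by chaining per-frame flow (hypothetical helper).

    start_xy: (x, y) position in the first frame.
    flows: list of arrays of shape (H, W, 2); flows[t][y, x] is the assumed
           (dx, dy) displacement of pixel (x, y) from frame t to frame t + 1.
    Returns the list of (x, y) positions, one per frame.
    """
    x, y = float(start_xy[0]), float(start_xy[1])
    track = [(x, y)]
    for flow in flows:
        h, w = flow.shape[:2]
        # Nearest-pixel lookup, clamped so the index stays inside the frame.
        xi = int(np.clip(round(x), 0, w - 1))
        yi = int(np.clip(round(y), 0, h - 1))
        dx, dy = flow[yi, xi]
        x, y = x + float(dx), y + float(dy)
        track.append((x, y))
    return track

# Synthetic data: every pixel moves by (+1, -0.5) per frame, like a camera pan.
flows = [np.tile(np.array([1.0, -0.5], dtype=np.float32), (8, 8, 1))
         for _ in range(3)]
track = propagate_point((2.0, 4.0), flows)
```

In a setup like this, an annotator would only need to correct the track where the chained flow drifts, which is the kind of labor saving the sentence above describes. Chained flow accumulates error over long spans, so this sketch is only plausible for short-term motion.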
Feb-9-2026, 03:46:23 GMT