18093dfe68516361d5b6239d33e045b1-Paper-Datasets_and_Benchmarks_Track.pdf
–Neural Information Processing Systems
We introduce ITTO, a challenging new benchmark suite for evaluating and diagnosing the capabilities and limitations of point tracking methods. Our videos are sourced from existing datasets and egocentric real-world recordings, with highquality human annotations collected through a multi-stage pipeline. ITTO captures the motion complexity, occlusion patterns, and object diversity characteristic of real-world scenes - factors that are largely absent in current benchmarks. We conduct a rigorous analysis of state-of-the-art tracking methods on ITTO, breaking down performance along key axes of motion complexity. Our findings reveal that existing trackers struggle with these challenges, particularly in re-identifying points after occlusion, highlighting critical failure modes. These results point to the need for new modeling approaches tailored to real-world dynamics. We envision ITTO as a foundation testbed for advancing point tracking and guiding the development of more robust tracking algorithms.
Neural Information Processing Systems
Jun-23-2026, 05:42:58 GMT
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.87)
- Research Report
- Industry:
- Information Technology (0.46)
- Technology:
- Information Technology > Artificial Intelligence
- Vision (1.00)
- Natural Language (0.93)
- Representation & Reasoning (0.68)
- Machine Learning > Neural Networks (0.67)
- Information Technology > Artificial Intelligence