Token Bottleneck: One Token to Remember Dynamics
–Neural Information Processing Systems
Deriving compact and temporally aware visual representations from dynamic scenes is essential for successful execution of sequential scene understanding tasks such as visual tracking and robotic manipulation. In this paper, we introduce Token Bottleneck (ToBo), a simple yet intuitive self-supervised learning pipeline that squeezes a scene into a bottleneck token and predicts the subsequent scene using minimal patches as hints.
Neural Information Processing Systems
Jun-20-2026, 10:41:31 GMT
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.68)
- Research Report
- Technology:
- Information Technology > Artificial Intelligence
- Vision (1.00)
- Robots (1.00)
- Natural Language (1.00)
- Representation & Reasoning (0.93)
- Machine Learning
- Neural Networks (0.93)
- Reinforcement Learning (0.68)
- Information Technology > Artificial Intelligence