TokenLearner: Adaptive Space-Time Tokenization for Videos Michael S. Ryoo
–Neural Information Processing Systems
This results in efficiently and effectively finding a few important visual tokens and enables modeling of pairwise attention between such tokens, over a longer temporal horizon for videos, or the spatial content in image frames.
Neural Information Processing Systems
Aug-14-2025, 23:53:20 GMT
- Country:
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
- Genre:
- Research Report (0.93)
- Technology:
- Information Technology > Artificial Intelligence
- Vision (1.00)
- Machine Learning > Neural Networks (1.00)
- Natural Language (0.68)
- Information Technology > Artificial Intelligence