Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction

Open in new window