Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction