VideoOFA: Two-Stage Pre-Training for Video-to-Text Generation
