Efficient Pre-training for Localized Instruction Generation of Videos
