Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval
