Large-Scale Adversarial Training for Vision-and-Language Representation Learning: Supplementary Material
