How Vision-Language Tasks Benefit from Large Pre-trained Models: A Survey

Open in new window