Spatial-ViLT: Enhancing Visual Spatial Reasoning through Multi-Task Learning
