Spatial-ViLT: Enhancing Visual Spatial Reasoning through Multi-Task Learning