Grounding Spatio-Temporal Language with Transformers
