SOAT: A Scene- and Object-Aware Transformer for Vision-and-Language Navigation
