Learning Semantic-Agnostic and Spatial-Aware Representation for Generalizable Visual-Audio Navigation
