Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment
