Enhancing Embodied Object Detection through Language-Image Pre-training and Implicit Object Memory
