Online Robot Navigation and Manipulation with Distilled Vision-Language Models