PhysVLM-AVR: Active Visual Reasoning for Multimodal Large Language Models in Physical Environments
