Using Vision Language Models as Closed-Loop Symbolic Planners for Robotic Applications: A Control-Theoretic Perspective
Hao Wang, Sathwik Karnik, Bea Lim, Somil Bansal
arXiv.org Artificial Intelligence
Large Language Models (LLMs) and Vision Language Models (VLMs) have been widely used for embodied symbolic planning. Yet, how to effectively use these models for closed-loop symbolic planning remains largely unexplored. Because they operate as black boxes, LLMs and VLMs can produce unpredictable or costly errors, making their use in high-level robotic planning especially challenging. In this work, we investigate how to use VLMs as closed-loop symbolic planners for robotic applications from a control-theoretic perspective. Concretely, we study how the control horizon and warm-starting impact the performance of VLM symbolic planners. We design and conduct controlled experiments to gain insights that are broadly applicable to utilizing VLMs as closed-loop symbolic planners, and we discuss recommendations that can help improve the performance of VLM symbolic planners. The project website can be found here.
Nov-11-2025