Improving Pre-Trained Vision-Language-Action Policies with Model-Based Search

Neary, Cyrus, Younis, Omar G., Kuramshin, Artur, Aslan, Ozgur, Berseth, Glen

Nov-14-2025–arXiv.org Artificial Intelligence

Pre-trained vision-language-action (VLA) models offer a promising foundation for generalist robot policies, but often produce brittle behaviors or unsafe failures when deployed zero-shot in out-of-distribution scenarios. We present Vision-Language-Action Planning & Search (VLAPS) -- a novel framework and accompanying algorithms that embed model-based search into the inference procedure of pre-trained VLA policies to improve their performance on robotic tasks. Specifically, our method biases a modified Monte Carlo Tree Search (MCTS) algorithm -- run using a model of the target environment -- using action priors defined by the VLA policy. By using VLA-derived abstractions and priors in model-based search, VLAPS efficiently explores language-conditioned robotics tasks whose search spaces would otherwise be intractably large. Conversely, by integrating model-based search with the VLA policy's inference procedure, VLAPS yields behaviors that are more performant than those obtained by directly following the VLA policy's action predictions. VLAPS offers a principled framework to: i) control test-time compute in VLA models, ii) leverage a priori knowledge of the robotic environment, and iii) integrate established planning and reinforcement learning techniques into the VLA inference process. Across all experiments, VLAPS significantly outperforms VLA-only baselines on language-specified tasks that would otherwise be intractable for uninformed search algorithms, increasing success rates by as much as 67 percentage points.
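To make the idea of biasing tree search with policy priors concrete, the following is a minimal sketch of a PUCT-style MCTS (the selection rule popularized by AlphaZero) on a toy one-dimensional task. It is not the authors' algorithm: the abstract only states that a modified MCTS is biased by VLA-derived action priors. The names `model_step`, `policy_prior`, `puct_select`, `GOAL`, and all constants are hypothetical, with `policy_prior` standing in for a VLA policy and `model_step` for a model of the target environment.

```python
import math
from dataclasses import dataclass, field

GOAL = 3  # hypothetical goal position on a 1-D line


@dataclass
class Node:
    state: int
    prior: float              # prior probability of the action leading here
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)

    def q(self) -> float:
        """Mean return observed through this node (0 if unvisited)."""
        return self.value_sum / self.visits if self.visits else 0.0


def model_step(state, action):
    """Toy deterministic environment model: (next_state, reward, done)."""
    nxt = state + action
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def policy_prior(state):
    """Stand-in for a VLA-derived action prior (assumed, fixed values)."""
    return {-1: 0.3, +1: 0.7}


def puct_select(node, c_puct):
    """Pick the child action maximizing Q plus a prior-weighted bonus."""
    total = sum(ch.visits for ch in node.children.values())

    def score(action):
        ch = node.children[action]
        bonus = c_puct * ch.prior * math.sqrt(total + 1) / (1 + ch.visits)
        return ch.q() + bonus

    return max(node.children, key=score)


def simulate(node, depth, c_puct, gamma=0.95):
    """Run one simulation from `node`; return its discounted return."""
    if depth == 0:
        return 0.0
    if not node.children:
        # Expand the leaf using the prior; with no learned value
        # function in this sketch, the leaf is valued at 0.
        for action, p in policy_prior(node.state).items():
            nxt, _, _ = model_step(node.state, action)
            node.children[action] = Node(state=nxt, prior=p)
        return 0.0
    action = puct_select(node, c_puct)
    child = node.children[action]
    _, reward, done = model_step(node.state, action)
    ret = reward
    if not done:
        ret += gamma * simulate(child, depth - 1, c_puct, gamma)
    child.visits += 1
    child.value_sum += ret
    return ret


def search(root_state, n_sim=200, depth=8, c_puct=1.5):
    """Return the most-visited root action after `n_sim` simulations."""
    root = Node(state=root_state, prior=1.0)
    for _ in range(n_sim):
        simulate(root, depth, c_puct)
    return max(root.children, key=lambda a: root.children[a].visits)


if __name__ == "__main__":
    print(search(0))
```

In this sketch the number of simulations `n_sim` acts as a test-time compute knob, and the prior steers exploration toward actions the policy already favors, while the model-based returns can override the prior when it is wrong.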

large language model, machine learning, reinforcement learning

Country:
- North America > Canada (0.28)

Genre:
- Research Report (1.00)
- Overview (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Robots (1.00)
  - Machine Learning > Reinforcement Learning (1.00)
  - Cognitive Science > Problem Solving (0.91)
  - Natural Language > Large Language Model (0.88)
  - Representation & Reasoning
    - Search (1.00)
    - Planning & Scheduling (1.00)
