MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents
Neural Information Processing Systems
Spatial planning is a crucial part of spatial intelligence: it requires understanding and planning the arrangement of objects in space. AI agents with spatial planning ability can better adapt to various real-world applications, including robotic manipulation, automatic assembly, and urban planning. Recent works have attempted to construct benchmarks for evaluating the spatial intelligence of Multimodal Large Language Models (MLLMs). Nevertheless, these benchmarks primarily focus on spatial reasoning in the typical Visual Question Answering (VQA) form, which suffers from a gap between abstract spatial understanding and concrete task execution. In this work, we take a step further and build a comprehensive benchmark called MineAnyBuild, which aims to evaluate the spatial planning ability of open-world AI agents in the Minecraft game. Specifically, MineAnyBuild requires an agent to generate executable architecture building plans based on given multi-modal human instructions.
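To make the idea of an "executable building plan" concrete, the following is a minimal sketch only. The abstract does not specify the plan format, so everything here is an assumption made for illustration: the layered grid of integer block IDs, the `block_names` lookup, and the helper names `plan_to_placements` and `to_setblock_commands` are all hypothetical. The sketch treats "executable" as meaning the plan can be turned into Minecraft `/setblock` commands.

```python
from typing import Dict, List, Tuple

# Hypothetical plan format (an assumption, not the benchmark's format):
# a list of horizontal layers from bottom to top; each layer is a list of
# rows, and each row is a list of integer block IDs, where 0 means air.
Plan = List[List[List[int]]]
Placement = Tuple[int, int, int, str]


def plan_to_placements(
    plan: Plan,
    block_names: Dict[int, str],
    origin: Tuple[int, int, int] = (0, 0, 0),
) -> List[Placement]:
    """Flatten a layered plan into (x, y, z, block_name) placements."""
    ox, oy, oz = origin
    placements: List[Placement] = []
    for y, layer in enumerate(plan):          # y is the vertical axis
        for z, row in enumerate(layer):
            for x, block_id in enumerate(row):
                if block_id == 0:             # skip air cells
                    continue
                placements.append(
                    (ox + x, oy + y, oz + z, block_names[block_id])
                )
    return placements


def to_setblock_commands(placements: List[Placement]) -> List[str]:
    """Render each placement as a Minecraft /setblock command string."""
    return [f"/setblock {x} {y} {z} {name}" for x, y, z, name in placements]


# Illustrative 2x2 floor with one glass block on top of a corner.
example_plan: Plan = [
    [[1, 1], [1, 1]],   # bottom layer: four plank blocks
    [[2, 0], [0, 0]],   # second layer: one glass block, rest air
]
example_names = {1: "minecraft:oak_planks", 2: "minecraft:glass"}

commands = to_setblock_commands(
    plan_to_placements(example_plan, example_names, origin=(10, 64, 10))
)
```

Emitting blocks layer by layer from the bottom up is one simple ordering under these assumptions; a real agent or evaluator may use a different representation and execution order.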
Jun-19-2026, 01:36:10 GMT
- Genre:
  - Research Report > Experimental Study (1.00)
- Industry:
  - Leisure & Entertainment > Games > Computer Games (0.36)
- Technology:
  - Information Technology > Artificial Intelligence
    - Vision (1.00)
    - Natural Language > Large Language Model (1.00)
    - Machine Learning > Neural Networks (0.95)
    - Representation & Reasoning
      - Agents (1.00)
      - Spatial Reasoning (0.90)