Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds
Tan, Weihao, Li, Xiangyang, Fang, Yunhao, Yao, Heyuan, Yan, Shi, Luo, Hao, Ao, Tenglong, Li, Huihui, Ren, Hongbin, Yi, Bairen, Qin, Yujia, An, Bo, Liu, Libin, Shi, Guang
arXiv.org Artificial Intelligence
We introduce Lumine, the first open recipe for developing generalist agents capable of completing hours-long complex missions in real time within challenging 3D open-world environments. Lumine adopts a human-like interaction paradigm that unifies perception, reasoning, and action in an end-to-end manner, powered by a vision-language model. It processes raw pixels at 5 Hz to produce precise 30 Hz keyboard-mouse actions and adaptively invokes reasoning only when necessary. Trained in Genshin Impact, Lumine completes the entire five-hour Mondstadt main storyline at human-level efficiency. It also follows natural language instructions to perform a broad spectrum of tasks in both 3D open-world exploration and 2D GUI manipulation, spanning collection, combat, puzzle-solving, and NPC interaction. In addition to its in-domain performance, Lumine demonstrates strong zero-shot cross-game generalization. Without any fine-tuning, it accomplishes 100-minute missions in Wuthering Waves and the full five-hour first chapter of Honkai: Star Rail. These promising results highlight Lumine's effectiveness across distinct worlds and interaction dynamics, marking a concrete step toward generalist agents in open-ended environments.
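The two rates quoted in the abstract imply a fixed ratio between perception and action steps. The sketch below only works through that arithmetic: the function names are hypothetical, and the assumption that action steps are spaced evenly within each perception interval is ours, not something the abstract states. It says nothing about how the model actually decodes or chunks its actions.

```python
from fractions import Fraction

PERCEPTION_HZ = 5   # from the abstract: raw pixels are processed at 5 Hz
ACTION_HZ = 30      # from the abstract: keyboard-mouse actions are produced at 30 Hz


def actions_per_frame(perception_hz=PERCEPTION_HZ, action_hz=ACTION_HZ):
    """Number of action steps that span one perception interval."""
    if action_hz % perception_hz != 0:
        raise ValueError("action rate must be an integer multiple of the perception rate")
    return action_hz // perception_hz


def action_schedule(num_frames, perception_hz=PERCEPTION_HZ, action_hz=ACTION_HZ):
    """Return (frame_index, step_in_frame, timestamp_in_seconds) for every action step.

    Assumes evenly spaced action steps (an illustrative assumption).
    Timestamps are exact fractions to avoid floating-point drift.
    """
    k = actions_per_frame(perception_hz, action_hz)
    step = Fraction(1, action_hz)
    return [
        (frame, s, (frame * k + s) * step)
        for frame in range(num_frames)
        for s in range(k)
    ]


if __name__ == "__main__":
    print(actions_per_frame())          # action steps per observed frame
    for row in action_schedule(2):      # schedule covering two perception intervals
        print(row)
```

Under these assumptions, each observed frame (one every 200 ms) has to account for six action steps of roughly 33 ms each.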
Nov-13-2025
- Country:
- Europe > Sweden > Skåne County > Malmö (0.04)
- Genre:
- Instructional Material > Course Syllabus & Notes (0.45)
- Research Report (1.00)
- Industry:
- Education (0.92)
- Information Technology > Software (0.93)
- Leisure & Entertainment > Games
- Computer Games (1.00)
- Technology:
- Information Technology
- Artificial Intelligence
- Cognitive Science (1.00)
- Machine Learning > Neural Networks
- Deep Learning (1.00)
- Natural Language > Large Language Model (1.00)
- Representation & Reasoning > Agents (1.00)
- Robots (1.00)
- Vision (1.00)
- Human Computer Interaction (1.00)