 build order


SC-Phi2: A Fine-tuned Small Language Model for StarCraft II Macromanagement Tasks

Khan, Muhammad Junaid, Sukthankar, Gita

arXiv.org Artificial Intelligence

This paper introduces SC-Phi2, a fine-tuned StarCraft II small language model for macromanagement tasks. Small language models, like Phi-2, Gemma, and DistilBERT, are streamlined versions of large language models (LLMs) with fewer parameters that require less power and memory to run. To teach Microsoft's Phi-2 model about StarCraft, we create a new SC2 text dataset with information about StarCraft races, roles, and actions and use it to fine-tune Phi-2 with self-supervised learning. We pair this language model with a Vision Transformer (ViT) from the pre-trained BLIP-2 (Bootstrapping Language Image Pre-training) model, fine-tuning it on the MSC replay dataset. This enables us to construct dynamic prompts that include visual game state information. Unlike the large models used in prior StarCraft LLM work, such as GPT-3.5, Phi-2 is trained primarily on textbook data and contains little inherent knowledge of StarCraft II beyond what is provided by our training process. By using LoRA (Low-rank Adaptation) and quantization, our model can be trained on a single GPU. We demonstrate that our model performs well at macromanagement tasks such as build order and global state prediction with a small number of parameters.


High-Level Strategy Selection under Partial Observability in StarCraft: Brood War

Gehring, Jonas, Ju, Da, Mella, Vegard, Gant, Daniel, Usunier, Nicolas, Synnaeve, Gabriel

arXiv.org Machine Learning

We consider the problem of high-level strategy selection in the adversarial setting of real-time strategy games from a reinforcement learning perspective, where taking an action corresponds to switching to the respective strategy. Here, a good strategy successfully counters the opponent's current and possible future strategies which can only be estimated using partial observations. We investigate whether we can utilize the full game state information during training time (in the form of an auxiliary prediction task) to increase performance. Experiments carried out within a StarCraft: Brood War bot against strong community bots show substantial win rate improvements over a fixed-strategy baseline and encouraging results when learning with the auxiliary task.


Modular Architecture for StarCraft II with Deep Reinforcement Learning

Lee, Dennis, Tang, Haoran, Zhang, Jeffrey O, Xu, Huazhe, Darrell, Trevor, Abbeel, Pieter

arXiv.org Artificial Intelligence

We present a novel modular architecture for StarCraft II AI. The architecture splits responsibilities between multiple modules that each control one aspect of the game, such as build-order selection or tactics. A centralized scheduler reviews macros suggested by all modules and decides their order of execution. An updater keeps track of environment changes and instantiates macros into a series of executable actions. Modules in this framework can be optimized independently or jointly via human design, planning, or reinforcement learning. We apply deep reinforcement learning techniques to train two of the six modules of a modular agent with self-play, achieving 94% or 87% win rates against the "Harder" (level 5) built-in Blizzard bot in Zerg vs. Zerg matches, with or without fog-of-war.


Build Order Optimization in StarCraft

Churchill, David (University of Alberta) | Buro, Michael (University of Alberta)

AAAI Conferences

In recent years, real-time strategy (RTS) games have gained interest in the AI research community for their multitude of challenging subproblems, such as collaborative pathfinding, effective resource allocation, and unit targeting. In this paper we consider the build order problem in RTS games, in which we need to find concurrent action sequences that, constrained by unit dependencies and resource availability, create a certain number of units and structures in the shortest possible time span. We present abstractions and heuristics that considerably speed up the search for approximate solutions in the game of StarCraft, and show the efficacy of our method by comparing its real-time performance with that of professional StarCraft players.
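
The build order problem described in this abstract can be illustrated with a toy sketch. It is not the authors' method: the item table, costs, build times, income rate, and the function names `makespan` and `best_order` are all hypothetical, and exhaustive permutation search stands in for the abstractions and heuristics the paper refers to. The model also assumes a single resource and no limit on how many items can be in production at once.

```python
from itertools import permutations

# Hypothetical item table: name -> (mineral cost, build time in seconds, prerequisite).
# The numbers are illustrative only and are not StarCraft game data.
ITEMS = {
    "worker": (50, 10, None),
    "barracks": (150, 40, None),
    "soldier": (50, 20, "barracks"),
}

INCOME_PER_WORKER = 1  # assumed: minerals gathered per second per worker


def makespan(order, minerals=50, workers=4, horizon=10_000):
    """Start items in the given order as early as possible.

    Returns the time at which the last item finishes, or None if the
    order is infeasible (a prerequisite is never started beforehand).
    """
    t = 0
    in_progress = []  # (finish_time, name) for items still being built
    done = set()
    last_finish = 0
    for name in order:
        cost, duration, prereq = ITEMS[name]
        while True:
            # Complete anything that has finished by time t.
            for item in [p for p in in_progress if p[0] <= t]:
                in_progress.remove(item)
                done.add(item[1])
                if item[1] == "worker":
                    workers += 1
            prereq_ok = prereq is None or prereq in done
            if minerals >= cost and prereq_ok:
                break
            # The prerequisite is neither built nor under construction.
            if not prereq_ok and all(p[1] != prereq for p in in_progress):
                return None
            minerals += workers * INCOME_PER_WORKER
            t += 1
            if t > horizon:
                return None
        minerals -= cost
        in_progress.append((t + duration, name))
        last_finish = max(last_finish, t + duration)
    return last_finish


def best_order(goal):
    """Brute-force the feasible ordering of `goal` with the smallest makespan."""
    feasible = []
    for order in set(permutations(goal)):
        finish = makespan(order)
        if finish is not None:
            feasible.append((finish, order))
    return min(feasible) if feasible else None
```

Calling `best_order(["worker", "barracks", "soldier"])` returns a `(finish_time, order)` pair. Brute force grows factorially with the number of items, which is why practical planners rely on search with abstractions and heuristics instead.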