Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach
Ma, Weiyu
With the continued advancement of Large Language Model (LLM) agents in reasoning, planning, and decision-making, benchmarks have become crucial in evaluating these skills. However, there is a notable gap in benchmarks for real-time strategic decision-making. StarCraft II (SC2), with its complex and dynamic nature, serves as an ideal setting for such evaluations. To this end, we have developed TextStarCraft II, a specialized environment for assessing LLMs in real-time strategic scenarios within SC2. Addressing the limitations of traditional Chain of Thought (CoT) methods, we introduce the Chain of Summarization (CoS) method, enhancing LLMs' capabilities in rapid and effective decision-making. Our key experiments included: 1. LLM Evaluation: We tested 10 LLMs in TextStarCraft II, most of which defeated the Lv5 built-in AI, showcasing effective strategic skills.
- Asia > South Korea (0.14)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (0.67)
- Leisure & Entertainment > Sports (1.00)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Government > Military (1.00)
Society of Mind Meets Real-Time Strategy: A Hierarchical Multi-Agent Framework for Strategic Reasoning
Ahn, Daechul, Kim, San, Choi, Jonghyun
Large Language Models (LLMs) have recently demonstrated impressive action sequence prediction capabilities but often struggle with dynamic, long-horizon tasks such as real-time strategy games. In a game such as StarCraft II (SC2), agents need to manage resource constraints and adapt to evolving battlefield situations in a partially observable environment, which often overwhelms existing LLM-based approaches. To address these challenges, we propose a hierarchical multi-agent framework that employs specialized imitation learning agents under a meta-controller called the Strategic Planner (SP). Through expert demonstrations, each specialized agent learns a distinctive strategy, such as aerial support or defensive maneuvers, and produces coherent, structured multi-step action sequences. The SP then orchestrates these proposals into a single, environmentally adaptive plan that ensures local decisions align with long-term strategies. We call this framework HIMA (Hierarchical Imitation Multi-Agent). We also present TEXTSCII-ALL, a comprehensive SC2 testbed that encompasses all race matchup combinations in SC2. Our empirical results show that HIMA outperforms the state of the art in strategic clarity, adaptability, and computational efficiency, underscoring the potential of combining specialized imitation modules with meta-level orchestration to develop more robust, general-purpose AI agents.
- North America > United States (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Leisure & Entertainment > Sports (0.93)
- Government > Military (0.67)
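The HIMA abstract above describes specialists that each propose a multi-step action sequence while a meta-level Strategic Planner selects one coherent plan. A minimal sketch of that orchestration pattern follows; all class names, the toy specialists, and the scoring rule are illustrative assumptions, not the paper's actual design.

```python
# Hypothetical sketch of the HIMA orchestration idea: several specialized
# agents each propose a multi-step action sequence, and a meta-level
# Strategic Planner (SP) scores the proposals against the current
# situation and selects one coherent plan.
from typing import Callable


class SpecialistAgent:
    def __init__(self, name: str, propose: Callable[[dict], list]):
        self.name = name
        self.propose = propose  # maps an observation to an action sequence


class StrategicPlanner:
    def __init__(self, agents: list, score: Callable[[dict, list], float]):
        self.agents = agents
        self.score = score  # situational fitness of a proposed sequence

    def plan(self, obs: dict) -> list:
        # Collect one proposal per specialist, keep the best-scoring one.
        proposals = {a.name: a.propose(obs) for a in self.agents}
        best = max(proposals, key=lambda n: self.score(obs, proposals[n]))
        return proposals[best]


# Toy specialists: one favors aerial units, one favors static defense.
air = SpecialistAgent("air", lambda o: ["build_starport", "train_banshee"])
defense = SpecialistAgent("defense", lambda o: ["build_bunker", "train_marine"])


# Toy scorer: prefer the defensive proposal only when outnumbered.
def score(obs: dict, seq: list) -> float:
    defensive = "build_bunker" in seq
    outnumbered = obs["enemy_army"] > obs["own_army"]
    return 1.0 if defensive == outnumbered else 0.0


sp = StrategicPlanner([air, defense], score)
```

The point of the pattern is that each specialist stays simple and trainable in isolation (e.g. via imitation learning), while adaptivity lives entirely in the planner's selection step.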
Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach
Ma, Weiyu, Mi, Qirui, Yan, Xue, Wu, Yuqiao, Lin, Runji, Zhang, Haifeng, Wang, Jun
StarCraft II is a challenging benchmark for AI agents due to the necessity of both precise micro-level operations and strategic macro awareness. Previous works, such as AlphaStar and SCC, achieve impressive performance on StarCraft II but still exhibit deficiencies in long-term strategic planning and strategy interpretability. Emerging large language model (LLM) agents, such as Voyager and MetaGPT, present immense potential for solving intricate tasks. Motivated by this, we aim to validate the capabilities of LLMs on StarCraft II, a highly complex RTS game. To conveniently take full advantage of LLMs' reasoning abilities, we first develop a textual StarCraft II environment, called TextStarCraft II, with which LLM agents can interact. Secondly, we propose a Chain of Summarization method, including single-frame summarization for processing raw observations and multi-frame summarization for analyzing game information, providing command recommendations, and generating strategic decisions. Our experiment consists of two parts: first, an evaluation by human experts, which includes assessing the LLMs' mastery of StarCraft II knowledge and the performance of LLM agents in the game; second, the in-game performance of LLM agents, encompassing aspects like win rate and the impact of Chain of Summarization. Experimental results demonstrate that: 1. LLMs possess the relevant knowledge and complex planning abilities needed to address StarCraft II scenarios; 2. human experts consider the performance of LLM agents to be close to that of an average player who has played StarCraft II for eight years; 3. LLM agents are capable of defeating the built-in AI at the Harder (Lv5) difficulty level. We have open-sourced the code and released demo videos of LLM agents playing StarCraft II.
- North America > United States (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Asia > South Korea (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
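The abstract above splits Chain of Summarization into single-frame summarization (compressing each raw observation) and multi-frame summarization (condensing a window of recent summaries into one decision prompt). A minimal sketch of that two-stage pipeline follows; the function names, the fields kept, and the window mechanism are illustrative assumptions, not the paper's actual implementation, and the final LLM call is omitted.

```python
# Hypothetical sketch of the Chain of Summarization pipeline: raw
# per-frame observations are compressed one at a time (single-frame
# summarization), and a sliding window of those summaries is then
# condensed into a single decision prompt (multi-frame summarization).
from collections import deque


def summarize_frame(obs: dict) -> str:
    """Single-frame summarization: keep only decision-relevant fields."""
    return (f"t={obs['time']}s minerals={obs['minerals']} "
            f"gas={obs['gas']} army={obs['army_supply']}")


def summarize_window(summaries: deque) -> str:
    """Multi-frame summarization: condense recent frames into one prompt."""
    lines = "\n".join(summaries)
    return f"Recent game state:\n{lines}\nRecommend the next macro action."


class CoSAgent:
    def __init__(self, window: int = 4):
        # Only the last `window` frame summaries are retained, bounding
        # the prompt size regardless of game length.
        self.window = deque(maxlen=window)

    def step(self, obs: dict) -> str:
        self.window.append(summarize_frame(obs))
        prompt = summarize_window(self.window)
        # In the real system this prompt would be sent to an LLM; here
        # we return it so the pipeline is inspectable.
        return prompt
```

The design point is that the LLM never sees raw observations: both summarization stages shrink the context so a slow model can still keep up with a real-time game.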
On Efficient Reinforcement Learning for Full-length Game of StarCraft II
Liu, Ruo-Ze (Nanjing University) | Pang, Zhen-Jia | Meng, Zhou-Yu | Wang, Wenhai | Yu, Yang | Lu, Tong
StarCraft II (SC2) poses a grand challenge for reinforcement learning (RL), whose main difficulties include a huge state space, a varying action space, and a long time horizon. In this work, we investigate a set of RL techniques for the full-length game of StarCraft II. We investigate a hierarchical RL approach, where the hierarchy involves two levels. One is macro-actions extracted from experts' demonstration trajectories, which reduce the action space by an order of magnitude. The other is a hierarchical architecture of neural networks, which is modular and facilitates scaling. We investigate a curriculum transfer training procedure that trains the agent from the simplest level to the hardest level. We train the agent on a single machine with 4 GPUs and 48 CPU threads. On a 64x64 map and using restrictive units, we achieve a win rate of 99% against the difficulty level-1 built-in AI. Through the curriculum transfer learning algorithm and a mixture of combat models, we achieve a 93% win rate against the most difficult non-cheating built-in AI (level-7). In this extended version of the paper, we improve our architecture to train the agent against the most difficult cheating-level AIs (level-8, level-9, and level-10). We also test our method on different maps to evaluate the extensibility of our approach. With a final 3-layer hierarchical architecture and significant training tricks, we increase the win rates against the level-8, level-9, and level-10 AIs to 96%, 97%, and 94%, respectively. Our code and models are all open-sourced at https://github.com/liuruoze/HierNet-SC2. To provide an AlphaStar-style baseline for our work as well as for the research and open-source community, we reproduce a scaled-down version of it, mini-AlphaStar (mAS). The latest version of mAS is 1.07, which can be trained using supervised learning and reinforcement learning on the raw action space, which has 564 actions.
It is designed to run training on a single common machine, by making the hyper-parameters adjustable and some settings simplified. We can then compare our work with mAS using the same computing resources and training time. Experimental results show that our method is more effective when using limited resources. The inference and training code of mini-AlphaStar is open-sourced at https://github.com/liuruoze/mini-AlphaStar. We hope our study can shed some light on future research into efficient reinforcement learning on SC2 and other large-scale games.
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- (12 more...)
- Workflow (0.68)
- Research Report (0.67)
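The abstract above describes curriculum transfer training: the agent is trained against the easiest built-in AI first and only advances to harder difficulty levels once it is winning reliably, carrying its policy forward each time. A minimal sketch of that training loop follows; `train_one_epoch`, the threshold, and the policy representation are stand-in assumptions, not the paper's actual code.

```python
# Hypothetical sketch of curriculum transfer training: advance to the
# next difficulty level only once the win rate passes a threshold, and
# reuse (transfer) the current policy as the starting point each time.
from typing import Callable, Tuple


def curriculum_train(train_one_epoch: Callable[[dict, int], Tuple[dict, float]],
                     levels: list,
                     threshold: float = 0.9,
                     max_epochs: int = 100):
    policy = {}  # the policy is carried over between levels (transfer)
    history = []  # (level, win rate reached) per curriculum stage
    for level in levels:
        win_rate = 0.0
        for _ in range(max_epochs):
            # One RL update cycle against the built-in AI at `level`.
            policy, win_rate = train_one_epoch(policy, level)
            if win_rate >= threshold:
                break  # mastered this level; move on to the next
        history.append((level, win_rate))
    return policy, history
```

Because the policy is never reset between stages, each harder opponent is faced with the skills learned against the easier ones, which is the transfer effect the paper relies on.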
Rethinking of AlphaStar
We present a different view of AlphaStar (AS), the program that achieved Grandmaster level in the game StarCraft II. It is considered a major advance for AI research. However, in this paper we present problems with AS, some of which are defects of the system itself, and some of which are important details neglected in its article. These problems raise two questions. One is: what can we learn from the building of AS? The other is: were the matches between it and humans fair? After the discussion, we present future research directions for these problems. Our study is based on a reproduction of the AS code, and our code is available online.
- Leisure & Entertainment > Games > Computer Games (1.00)
- Information Technology (0.92)
Google's StarCraft-playing AI is crushing pro gamers
In December, AlphaStar played as a Protoss and won five games against Dario Wünsch, a German player who goes by the gamer handle TLO and who also played as a Protoss (although it is not the group in which he specializes). A week later, the AI won five games again, this time against a tougher Protoss competitor: Grzegorz Komincz, a professional gamer from Poland who goes by the name MaNa. DeepMind announced the victories Thursday during a live stream on YouTube and Twitch. The researchers used a sort of tournament-style approach to train AlphaStar. First, they spent three days training a neural network -- a machine-learning algorithm modeled after the way neurons work in a brain -- on replays of human players' StarCraft II games. This neural network was used to create a number of computer-based competitors that played many, many rounds of the game against each other, learning from their experiences, over the course of two weeks.
AI Dominates Human Professional Players in StarCraft II
An artificial intelligence has defeated two top-ranked human players in the computer game StarCraft II, using some strategies rarely encountered before. On Thursday, gamers were able to watch the AI agent, called AlphaStar, expertly command armies of "Protoss" units against the professional players. The result: The AI beat the humans 10 out of the 11 matches. "I was surprised by how strong the agent was," said Dario "TLO" Wünsch, one of the human players. "AlphaStar takes well-known strategies and turns them on their head."
StarCraft II-playing AI AlphaStar takes out pros undefeated
Losing to the computer in StarCraft has been a tradition of mine since the first game came out in 1998. Of course, the built-in "AI" is trivial for serious players to beat, and for years researchers have attempted to replicate human strategy and skill in the latest version of the game. They've just made a huge leap with AlphaStar, which recently beat two leading pros 5-0. The new system was created by DeepMind, and in many ways it's very unlike what you might call a "traditional" StarCraft AI. The computer opponents you can select in the game are really pretty dumb -- they have basic built-in strategies, and know in general how to attack and defend and how to progress down the tech tree.