Grounded Reinforcement Learning: Learning to Win the Game under Human Commands (Supplementary Materials)
–Neural Information Processing Systems
In this section, we describe the details of the MiniRTS environment and the human dataset. The data do not contain any personally identifiable information or offensive content.
Figure 1: MiniRTS [2] implements the rock-paper-scissors attack graph; each army type has some units it is effective against and some it is vulnerable to. "swordman", "spearman" and "cavalry" are all effective against "archer".
Figure 2: Building units can produce different army units using resources.
Resource Units: Resource units are stationary and neutral.
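The attack graph described in the Figure 1 caption can be pictured as a directed "effective against" relation. The following is a minimal sketch, not the MiniRTS implementation: the names `EFFECTIVE_AGAINST`, `is_effective` and `vulnerable_to` are hypothetical, and only the three edges stated in the caption are encoded. The remaining edges of the graph are not given in this excerpt and are deliberately left out.

```python
# Hypothetical sketch of an "effective against" relation.
# Only the edges stated in the Figure 1 caption are included;
# the rest of the MiniRTS attack graph is not reproduced here.
EFFECTIVE_AGAINST = {
    "swordman": {"archer"},
    "spearman": {"archer"},
    "cavalry": {"archer"},
}


def is_effective(attacker, defender):
    """Return True if an attacker -> defender edge is recorded in the table."""
    return defender in EFFECTIVE_AGAINST.get(attacker, set())


def vulnerable_to(defender):
    """Return the sorted list of recorded unit types effective against `defender`."""
    return sorted(
        attacker
        for attacker, targets in EFFECTIVE_AGAINST.items()
        if defender in targets
    )


if __name__ == "__main__":
    print(is_effective("cavalry", "archer"))
    print(vulnerable_to("archer"))
```

A `False` result from `is_effective` only means that the edge is not recorded in this partial table, not that the matchup is known to be ineffective in the game.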
Feb-8-2026, 05:06:43 GMT