Agents
Our initial experiments (implementation, debugging, hyperparameter tuning, etc.) required about 5000 CPU hours of compute. Due to these rules, it is recommended to group together in order to attack simultaneously. In Warehouse[4], QTRAN makes slightly faster progress than VAST (η = 12). The results for Warehouse[16], Battle[80], and GaussianSqueeze[800] are shown in Figure 1.

Figure 10: Visualizations of the generated sub-teams of XMetaGrad with η = 14 and XSpatial with k-means clustering using 10 centroids at different stages (early, middle, late) in Battle[80] after training.

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments.