e1cf57f1e104c6c05e31894c15a65e99-Supplemental-Conference.pdf

Feb-12-2026, 10:36:33 GMT–Neural Information Processing Systems

Here we report both the median test win rate and mean episodic return with the 95% confidence interval.

                          MAPPO          MAA2C
hidden dimension          128            128
learning rate             0.0003         0.0005
reward standardisation    False          True
network type              MLP/GRU/ATM    MLP/GRU/ATM
entropy coefficient       0.001          0.01
target update             0.01 (soft)    0.01 (soft)
n-step                    5              10

ATM is used as the individual policy network for agents, and we give the detailed network configurations of ATM in Table 5. We provide the translation of agent 0's decision process in one battle on 5m_vs_6m, as shown in Table 7.
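As a minimal sketch, the MAPPO and MAA2C hyperparameters listed above could be captured in code as follows. The names `AlgoConfig` and `make_config` and the field names are hypothetical and do not come from the authors' code; only the numeric and boolean values are taken from the table.

```python
from dataclasses import dataclass

# Network types listed in the hyperparameter table.
NETWORK_TYPES = ("MLP", "GRU", "ATM")


@dataclass(frozen=True)
class AlgoConfig:
    """Hypothetical container for one column of the hyperparameter table."""
    hidden_dim: int
    learning_rate: float
    reward_standardisation: bool
    network_type: str
    entropy_coef: float
    target_update_tau: float  # soft target update rate
    n_step: int


def make_config(algo: str, network_type: str = "ATM") -> AlgoConfig:
    """Return the table values for 'MAPPO' or 'MAA2C' (illustrative helper)."""
    if network_type not in NETWORK_TYPES:
        raise ValueError(f"unknown network type: {network_type}")
    if algo == "MAPPO":
        return AlgoConfig(128, 0.0003, False, network_type, 0.001, 0.01, 5)
    if algo == "MAA2C":
        return AlgoConfig(128, 0.0005, True, network_type, 0.01, 0.01, 10)
    raise ValueError(f"unknown algorithm: {algo}")


if __name__ == "__main__":
    for name in ("MAPPO", "MAA2C"):
        print(name, make_config(name))
```

Keeping both columns behind one helper makes the differences between the two algorithms (learning rate, reward standardisation, entropy coefficient, n-step) easy to compare at a glance.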

network configuration               value
entity-bound action layer number    1
self embedding layer number         1
value layer hidden dimension        64
(14 more...)

