Hyper-parameters            Value
Replay Buffer Parameters
burn-in-frames              10000
replay buffer size          131072 (2
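
To illustrate how the two replay buffer values listed above might interact, here is a minimal sketch of a fixed-capacity buffer that refuses to serve samples until a burn-in threshold is reached. The class name `ReplayBuffer`, its methods, and the reading of burn-in-frames as "stored items required before sampling starts" are assumptions made for illustration; the source does not describe the actual implementation.

```python
import random
from collections import deque

BURN_IN_FRAMES = 10_000        # value from the table above
REPLAY_BUFFER_SIZE = 131_072   # value from the table above


class ReplayBuffer:
    """Hypothetical FIFO replay buffer with a burn-in threshold."""

    def __init__(self, capacity=REPLAY_BUFFER_SIZE, burn_in=BURN_IN_FRAMES):
        # deque(maxlen=...) silently evicts the oldest item once full.
        self.storage = deque(maxlen=capacity)
        self.burn_in = burn_in

    def add(self, transition):
        self.storage.append(transition)

    def ready(self):
        # Assumption: training may start once burn_in items are stored.
        return len(self.storage) >= self.burn_in

    def sample(self, batch_size, rng=random):
        if not self.ready():
            raise RuntimeError("burn-in not finished")
        # Uniform sampling without replacement from the stored items.
        return rng.sample(list(self.storage), batch_size)
```

In this sketch a training loop would keep calling `add` from the simulation side and only begin calling `sample` once `ready()` returns true.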

Neural Information Processing Systems 

We train every agent with a distributed framework that handles both simulation and training. During training, agents explore using epsilon exploration. We train two distinct policies to test the ad-hoc teamplay performance of our agents.
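
As a rough illustration of epsilon exploration, the sketch below shows the common epsilon-greedy rule: with probability epsilon the agent picks a uniformly random action, otherwise it picks the action with the highest estimated value. The function name `epsilon_greedy` and the plain list of Q-values are assumptions for this example; the source does not specify the exact exploration scheme or its schedule.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index chosen by the epsilon-greedy rule.

    q_values: estimated value of each action (hypothetical input format).
    epsilon:  probability of taking a uniformly random action.
    """
    if rng.random() < epsilon:
        # Explore: ignore the value estimates.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon=0.0` the rule is purely greedy, and with `epsilon=1.0` it is purely random; training setups typically use a value in between.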
