4547dff5fd7604f18c8ee32cf3da41d7-Supplemental.pdf

Neural Information Processing Systems 

To train every agent, we use a distributed framework for simulation and training. For simulation, we run 6400 Hanabi environments in parallel and batch their trajectories together for efficient GPU computation. This is efficient because every thread can hold many environments, in each of which many agents interact. Every agent chooses its actions through neural network calls, which are more computationally intensive and are executed on GPUs. Making these calls asynchronously lets a thread support multiple environments: it keeps stepping its other environments while it waits for prior agents' actions to be computed.
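The asynchronous pattern described above can be illustrated with a minimal sketch. It is not the authors' framework: it assumes a single Python thread running `asyncio`, a toy environment whose observations are plain integers, and a hypothetical `BatchedPolicy` class that stands in for the GPU-backed network by answering queued requests in batches after a short simulated delay. All names and parameters here (`BatchedPolicy`, `run_env`, `simulate`, `max_batch`) are illustrative assumptions.

```python
import asyncio


class BatchedPolicy:
    """Hypothetical stand-in for a GPU-backed network that answers requests in batches."""

    def __init__(self, max_batch):
        self.max_batch = max_batch
        self.queue = asyncio.Queue()
        self.batch_sizes = []  # recorded only to inspect how requests were grouped

    async def act(self, obs):
        # Enqueue the observation and suspend until the batch containing it is processed.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((obs, fut))
        return await fut

    async def serve(self):
        while True:
            # Wait for at least one request, then drain whatever else is already pending.
            batch = [await self.queue.get()]
            while len(batch) < self.max_batch and not self.queue.empty():
                batch.append(self.queue.get_nowait())
            await asyncio.sleep(0.001)  # assumed placeholder for one batched forward pass
            self.batch_sizes.append(len(batch))
            for obs, fut in batch:
                fut.set_result(obs % 3)  # toy "action"; a real policy would use the network output


async def run_env(env_id, policy, num_agents, num_steps):
    """Toy environment loop: agents take turns, each action comes from the shared policy."""
    trajectory = []
    for step in range(num_steps):
        agent = step % num_agents
        obs = env_id + step  # toy observation, not a Hanabi state
        action = await policy.act(obs)  # yields control so other environments can progress
        trajectory.append((agent, obs, action))
    return trajectory


async def simulate(num_envs=64, num_agents=2, num_steps=5, max_batch=32):
    # Create the policy inside the running event loop so its queue belongs to that loop.
    policy = BatchedPolicy(max_batch)
    server = asyncio.create_task(policy.serve())
    trajectories = await asyncio.gather(
        *(run_env(i, policy, num_agents, num_steps) for i in range(num_envs))
    )
    server.cancel()
    try:
        await server
    except asyncio.CancelledError:
        pass
    return trajectories, policy.batch_sizes


if __name__ == "__main__":
    trajectories, batch_sizes = asyncio.run(simulate())
    print(len(trajectories), "trajectories; largest batch:", max(batch_sizes))
```

In this sketch, one thread interleaves all environments: whenever an environment awaits an action, the event loop moves on to another, so requests accumulate and are served together. A real system would replace the sleep with an actual batched network call and would typically run several such threads.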
