A Ablations

Neural Information Processing Systems 

We find that past play greatly stabilizes the emergence of reciprocity in IPD. In cells containing another agent, we include the RUSP observations in these channels. In Figure 11 we show results when training with RUSP in these environments. Consistent with past work, the greedy baseline fails to reach a solution with high collective return. We use a distributed computing infrastructure used in Berner et al.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found