
Reinforcement Learning






PerfectDou: Dominating DouDizhu with Perfect Information Distillation

Guan Yang

Neural Information Processing Systems

As a challenging multi-player card game, DouDizhu has recently drawn much attention for analyzing competition and collaboration in imperfect-information games. In this paper, we propose PerfectDou, a state-of-the-art DouDizhu AI system that dominates the game, in an actor-critic framework with a proposed technique named perfect information distillation.



Supplementary Material for Rethinking Value Function Learning for Generalization in Reinforcement Learning

A Stiffness Analysis

Neural Information Processing Systems

The green lines in Figure 1 demonstrate that the stiffness decreases as the number of training levels increases in most of the Procgen games. This suggests that the delayed critic update effectively alleviates the memorization problem. Each agent is trained on 200 training levels for 25M environment steps. Each agent is trained for 8M environment steps. The mean is computed over 10 different runs.



How to talk so AI will learn: Instructions, descriptions, and autonomy

Neural Information Processing Systems

From the earliest years of our lives, humans use language to express our beliefs and desires. Being able to talk to artificial agents about our preferences would thus fulfill a central goal of value alignment. Yet today, we lack computational models explaining such language use.