Game Solving with Online Fine-Tuning

Neural Information Processing Systems 

We follow the PCN training method of Wu et al. The PCN architecture consists of three residual blocks with 256 hidden channels. A total of 400,000 self-play games are generated over the course of training. During optimization, the learning rate is fixed at 0.02 and the batch size is set to 1,024. The PCN is optimized for 500 steps after every 2,000 self-play games.
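As a sanity check on this schedule, the hyperparameters above imply a fixed total optimization budget. The sketch below simply works out that arithmetic; the variable names are illustrative, not taken from the paper:

```python
# Training hyperparameters for the PCN, as stated in the text.
TOTAL_GAMES = 400_000        # total self-play games generated
GAMES_PER_PHASE = 2_000      # self-play games between optimization phases
STEPS_PER_PHASE = 500        # optimization steps per phase
BATCH_SIZE = 1_024           # positions per optimization step
LEARNING_RATE = 0.02         # fixed throughout training

phases = TOTAL_GAMES // GAMES_PER_PHASE      # number of optimization phases
total_steps = phases * STEPS_PER_PHASE       # total optimization steps
samples_seen = total_steps * BATCH_SIZE      # total training samples consumed

print(phases, total_steps, samples_seen)     # → 200 100000 102400000
```

In other words, training amounts to 200 optimization phases, 100,000 gradient steps in total, and roughly 102 million sampled positions (with replacement across the replay data).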
