Game Solving with Online Fine-Tuning
Neural Information Processing Systems
We largely follow the PCN training method of Wu et al. The PCN architecture consists of three residual blocks with 256 hidden channels. Training generates a total of 400,000 self-play games. During optimization, the learning rate is fixed at 0.02 and the batch size is 1,024. The PCN is optimized for 500 steps for every 2,000 self-play games generated.
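The schedule implied by these numbers can be worked out with a short sketch. The class and field names below (`PCNTrainingConfig`, `games_per_iteration`, and so on) are hypothetical and do not come from the authors' code; only the numeric values are taken from the text. The sketch also assumes that training proceeds in equal rounds of 2,000 games followed by 500 optimization steps, which the text suggests but does not state outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PCNTrainingConfig:
    """Hypothetical container for the hyperparameters stated in the text."""

    residual_blocks: int = 3
    hidden_channels: int = 256
    total_self_play_games: int = 400_000
    games_per_iteration: int = 2_000   # self-play games per training round
    steps_per_iteration: int = 500     # optimization steps per training round
    learning_rate: float = 0.02        # fixed, no decay schedule
    batch_size: int = 1_024

    @property
    def iterations(self) -> int:
        # Number of self-play/optimization rounds (assumes equal-sized rounds).
        return self.total_self_play_games // self.games_per_iteration

    @property
    def total_optimization_steps(self) -> int:
        return self.iterations * self.steps_per_iteration

    @property
    def total_samples_drawn(self) -> int:
        # Positions sampled across all mini-batches, counted with repetition.
        return self.total_optimization_steps * self.batch_size


config = PCNTrainingConfig()
print(config.iterations)                # 200
print(config.total_optimization_steps)  # 100000
print(config.total_samples_drawn)       # 102400000
```

Under that assumption, the settings correspond to 200 rounds and 100,000 optimization steps in total.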
Nov-19-2025, 17:47:16 GMT