Game Solving with Online Fine-Tuning

Feb-16-2026, 17:12:30 GMT – Neural Information Processing Systems

A.1 PCN training

We largely follow the PCN training method of Wu et al. [1], but replace the AlphaZero algorithm with the Gumbel AlphaZero algorithm [2], where the simulation count is set to 322 in self-play and the search starts by sampling 16 actions. The PCN architecture consists of three residual blocks with 256 hidden channels. A total of 400,000 self-play games are generated over the whole training run. During optimization, the learning rate is fixed at 0.02 and the batch size is set to 1,024.

A.3 Worker design

The worker is itself a Killall-Go solver. Thus, to fully utilize GPU resources, we implement batched GPU inference to accelerate PCN evaluations for workers.
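The batching idea for workers can be illustrated with a minimal sketch. It is not the authors' implementation: the class `BatchedEvaluator`, the `network_fn` callable standing in for a PCN forward pass, and the `max_batch_size` value are all hypothetical names and settings chosen for illustration. The sketch only shows the general pattern of collecting evaluation requests from several workers and serving them with one network call.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: an evaluation is a (policy, value) pair.
Evaluation = Tuple[List[float], float]


class BatchedEvaluator:
    """Queues evaluation requests and runs them through the network in batches."""

    def __init__(self,
                 network_fn: Callable[[Sequence[object]], List[Evaluation]],
                 max_batch_size: int = 64) -> None:
        # network_fn is an assumed stand-in for a batched PCN forward pass.
        self.network_fn = network_fn
        # max_batch_size is an illustrative value, not taken from the paper.
        self.max_batch_size = max_batch_size
        self.pending: List[Tuple[object, Callable[[Evaluation], None]]] = []
        self.batches_run = 0

    def submit(self, position: object,
               callback: Callable[[Evaluation], None]) -> None:
        """Queue one position; run the network once the batch is full."""
        self.pending.append((position, callback))
        if len(self.pending) >= self.max_batch_size:
            self.flush()

    def flush(self) -> None:
        """Evaluate all queued positions in a single network call."""
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        results = self.network_fn([position for position, _ in batch])
        self.batches_run += 1
        # Hand each result back to the worker that requested it.
        for (_, callback), result in zip(batch, results):
            callback(result)


def fake_network(positions: Sequence[object]) -> List[Evaluation]:
    # Dummy network: uniform two-move policy; the value records the batch size
    # so the batching behaviour is visible in the results.
    return [([0.5, 0.5], float(len(positions))) for _ in positions]


evaluator = BatchedEvaluator(fake_network, max_batch_size=4)
results = {}
for worker_id in range(10):
    evaluator.submit(
        f"position-{worker_id}",
        lambda evaluation, w=worker_id: results.__setitem__(w, evaluation),
    )
evaluator.flush()  # evaluate the final, partially filled batch
```

In this toy run, ten requests with a batch size of four lead to three network calls instead of ten. A real solver would replace `fake_network` with an actual GPU forward pass and would likely add a timeout so that a partially filled batch does not stall workers.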


