e8258e5140317ff36c7f8225a3bf9590-Supplemental.pdf

Feb-11-2026, 16:41:22 GMT–Neural Information Processing Systems

The original MuZero did not use sticky actions (Machado et al., 2017), i.e. a 25% chance that the selected action is ignored and the previous action is repeated instead, for its Atari experiments. For all experiments in this work we used a network architecture based on the one introduced by MuZero (Schrittwieser et al., 2020). To implement the network, we used the modules provided by the Haiku neural network library (Hennigan et al., 2020). We did not observe any benefit from using a Gaussian mixture, so in all our experiments we instead used a single Gaussian with diagonal covariance. All experiments used the Adam optimiser (Kingma & Ba, 2015) with decoupled weight decay (Loshchilov & Hutter, 2017) for training.
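The sticky-action rule described above (with some probability the selected action is ignored and the previous action is repeated) can be illustrated with a minimal sketch. This is not the authors' code or the emulator's implementation: the class name `StickyActionSelector`, its `repeat_prob` and `seed` parameters, and the choice to apply the check once per call are all assumptions made for illustration. Whether a call corresponds to an agent step or an emulator frame is left to the caller.

```python
import random


class StickyActionSelector:
    """Hypothetical helper: repeats the previous action with a fixed probability."""

    def __init__(self, repeat_prob=0.25, seed=None):
        # repeat_prob = 0.25 matches the 25% figure quoted in the text.
        self.repeat_prob = repeat_prob
        self._rng = random.Random(seed)
        self._previous_action = None

    def reset(self):
        # Forget the previous action, e.g. at the start of a new episode.
        self._previous_action = None

    def __call__(self, selected_action):
        # With probability repeat_prob, ignore the selected action and
        # repeat the previously executed one (if there is one).
        if (self._previous_action is not None
                and self._rng.random() < self.repeat_prob):
            executed = self._previous_action
        else:
            executed = selected_action
        self._previous_action = executed
        return executed
```

In use, an agent would pass each chosen action through the selector and send the returned value to the environment, for example `executed = selector(agent_action)`. Setting `repeat_prob=0.0` recovers the non-sticky behaviour attributed above to the original MuZero.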



