setup
–Neural Information Processing Systems
We implement the setup described below in JAX [6] and Haiku [35]. We use Residual Networks (ResNets) and Wide ResNets (WRNs) [31, 79]. This is consistent with prior work [30, 49, 60, 72, 82], which uses diverse variants of these network families. We adopt the same architecture details as Gowal et al. [30], with Swish/SiLU [33] activation functions. Most experiments are conducted on a WRN-28-10 model, which has a depth of 28 and a width multiplier of 10 and contains 36M parameters. To evaluate the effect of additional generated data on wider and deeper networks, we also run several experiments using WRN-70-16, which contains 267M parameters.
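As a sanity check on the model sizes quoted above, the following minimal sketch counts parameters for a WRN of a given depth and width multiplier. It is plain Python and does not depend on JAX or Haiku. It assumes the standard pre-activation Wide ResNet layout (three groups of (depth - 4) / 6 blocks with base widths 16, 32 and 64, bias-free 3x3 convolutions, 1x1 projection shortcuts where the channel count changes, and batch normalization with a scale and an offset per channel) and a 10-class output. None of these details are stated in the text, and the helper name `wrn_param_count` is hypothetical.

```python
def wrn_param_count(depth, width, num_classes=10, in_channels=3):
    """Count trainable parameters of a pre-activation Wide ResNet.

    Assumed layout (not taken from the paper): three groups of
    (depth - 4) / 6 residual blocks, each block holding two 3x3
    convolutions, with a 1x1 projection shortcut on channel changes.
    """
    assert (depth - 4) % 6 == 0, "depth must have the form 6n + 4"
    blocks_per_group = (depth - 4) // 6
    widths = [16 * width, 32 * width, 64 * width]

    def conv(kernel, c_in, c_out):
        return kernel * kernel * c_in * c_out  # bias-free convolution

    def norm(channels):
        return 2 * channels  # one scale and one offset per channel

    total = conv(3, in_channels, 16)  # stem convolution
    c_in = 16
    for c_out in widths:
        for _ in range(blocks_per_group):
            total += norm(c_in) + conv(3, c_in, c_out)
            total += norm(c_out) + conv(3, c_out, c_out)
            if c_in != c_out:
                total += conv(1, c_in, c_out)  # projection shortcut
            c_in = c_out
    total += norm(c_in)  # final normalization before pooling
    total += c_in * num_classes + num_classes  # linear classifier with bias
    return total


if __name__ == "__main__":
    for depth, width in [(28, 10), (70, 16)]:
        millions = wrn_param_count(depth, width) / 1e6
        print(f"WRN-{depth}-{width}: {millions:.1f}M parameters")
```

Under these assumptions the counts come out close to the 36M and 267M figures reported for WRN-28-10 and WRN-70-16. Swish/SiLU activations add no parameters, so they do not affect the totals.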
Apr-25-2026, 02:25:30 GMT