 nsa-i


Stochastic Architectures

Neural Information Processing Systems

We take 1000 training images from CIFAR-10 as a fixed batch, randomly sample the neural architecture for inference, and compute var(µ) of the last BN layer of an NSA and an NSA-i trained given S = 5000 architectures. In this section, we calculate the test accuracy of 200 randomly sampled architectures based on the vanilla NSA models trained under various spaces. Half of these architectures are seen during training, while the other half are not.
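The var(µ) measurement described above can be illustrated with a minimal sketch. It is not the authors' code. It assumes that µ is the per-channel batch mean of the features entering the last BN layer, and that var(µ) is the variance of that mean across sampled architectures, averaged over channels. The toy network, the one-bit-per-layer skip-connection encoding, and the names `sample_architecture`, `last_bn_batch_mean` and `var_of_bn_mean` are all hypothetical.

```python
import numpy as np


def sample_architecture(rng, num_layers):
    # Hypothetical encoding: one bit per layer, 1 = identity skip connection on.
    return rng.integers(0, 2, size=num_layers)


def last_bn_batch_mean(x, weights, arch):
    # Toy forward pass: each layer is linear + ReLU, optionally with a skip.
    h = x
    for w, skip in zip(weights, arch):
        out = np.maximum(h @ w, 0.0)
        h = out + h if skip else out
    # Per-channel mean over the fixed batch (the statistic BN would normalise by).
    return h.mean(axis=0)


def var_of_bn_mean(x, weights, num_archs, rng):
    # Collect the batch mean under each sampled architecture, shape (num_archs, channels).
    mus = np.stack([
        last_bn_batch_mean(x, weights, sample_architecture(rng, len(weights)))
        for _ in range(num_archs)
    ])
    # Variance across architectures, then averaged over channels (an assumption).
    return float(mus.var(axis=0).mean())
```

With a fixed batch of 1000 inputs, calling `var_of_bn_mean(x, weights, num_archs, rng)` on two sets of weights would give one scalar per model. Larger values indicate that the statistics seen by the last BN layer shift more as the architecture changes.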



Supplementary Material for: Understanding and Exploring the Network with Stochastic Architectures

Neural Information Processing Systems

In this section, we plot the 5 randomly sampled architectures used in NSA-id in Sec. 5. As shown in Figure 1, the 5 architectures are distinct from each other. We provide more results for the training and test behaviour of vanilla NSA and NSA-i in this section. Figure 5: Five randomly sampled architectures used in NSA-id in Sec. 5. The training architecture space consists of 50000 samples.