Supplementary Material for: Understanding and Exploring the Network with Stochastic Architectures

Neural Information Processing Systems 

In this section, we plot the 5 randomly sampled architectures used in NSA-id in Sec. 5. As shown in Figure 1, the 5 architectures are distinct from each other.

We provide more results for the training and test behaviour of vanilla NSA and NSA-i in this section.

Figure 5: Five randomly sampled architectures used in NSA-id in Sec. 5.

The training architecture space consists of 50000 samples.