Stochastic Architectures
–Neural Information Processing Systems
We take 1000 training images from CIFAR-10 as a fixed batch, randomly sample the neural architecture for inference, and compute var(µ) of the last BN layer of an NSA and an NSA-i trained given S = 5000 architectures. In this section, we calculate the test accuracy of 200 randomly sampled architectures based on the vanilla NSA models trained under various spaces. Half of these architectures are seen during training, while the other half are not.
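The var(µ) measurement can be illustrated with a minimal sketch. It assumes that µ is the per-channel batch mean a BN layer would compute on the fixed batch, and that var(µ) is the variance of this mean across sampled architectures, averaged over channels; the excerpt does not spell out either detail. The names `bn_batch_mean`, `variance_of_bn_mean`, and `feature_fn` are hypothetical and stand in for a real model's forward pass up to the last BN layer.

```python
import numpy as np


def bn_batch_mean(features):
    """Per-channel mean over batch and spatial axes, as BN computes it.

    features: array of shape (N, C, H, W), the input to the BN layer.
    """
    return features.mean(axis=(0, 2, 3))


def variance_of_bn_mean(feature_fn, architectures):
    """Variance of the BN batch mean across sampled architectures.

    feature_fn (hypothetical): maps an architecture to the (N, C, H, W)
    activations entering the last BN layer for one fixed input batch.
    Returns the across-architecture variance, averaged over channels
    (the channel averaging is an assumption of this sketch).
    """
    mus = np.stack([bn_batch_mean(feature_fn(a)) for a in architectures])
    return float(mus.var(axis=0).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fixed_batch = rng.normal(size=(1000, 8, 4, 4))  # toy stand-in for CIFAR-10 features

    # Toy "network": each architecture id shifts the activations by a
    # different offset, so the BN mean depends on the sampled architecture.
    def toy_feature_fn(arch_id):
        return fixed_batch + 0.1 * arch_id

    sampled = rng.integers(0, 5000, size=200)
    print(variance_of_bn_mean(toy_feature_fn, sampled))
```

With a real model, `feature_fn` would run the fixed batch through the network configured with the sampled architecture and return the activations entering the last BN layer. A larger value indicates that the BN statistics vary more from one architecture to another.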
Feb-9-2026, 18:35:21 GMT