A Appendix
A.1 Toy problem
Neural Information Processing Systems
Gaussian noise samples, and the embeddings in each of the three cases, respectively from left to right.
Figure 1: (Left) Circles data on which the MLP is trained. Neurons in that case either explode or rarely activate.
Figure 2: Unclipped scatter plot (linked to Figure 3 (Right) of the paper) and accompanying distribution plot for 'avgpool' layer of the student network trained for CIFAR10 using our approach.
Figure 3: (a) Varying number of batches to adjust student's running statistics.
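For readers who want to reproduce something like the circles data named in the Figure 1 caption, the following is a minimal sketch that assumes the toy set consists of two concentric noisy circles with binary labels. The function name `make_circles`, its parameters, and their defaults are illustrative assumptions; the source does not state how the data was generated.

```python
import math
import random


def make_circles(n_samples=200, inner_radius=0.5, noise=0.05, seed=0):
    """Return (points, labels) for two concentric noisy circles.

    Label 0 is the outer circle (radius 1), label 1 the inner circle.
    All names and default values here are assumptions for illustration.
    """
    rng = random.Random(seed)
    points, labels = [], []
    for i in range(n_samples):
        label = i % 2  # alternate between the outer and inner circle
        radius = inner_radius if label == 1 else 1.0
        angle = rng.uniform(0.0, 2.0 * math.pi)
        # Place the point on its circle, then perturb it with Gaussian noise.
        x = radius * math.cos(angle) + rng.gauss(0.0, noise)
        y = radius * math.sin(angle) + rng.gauss(0.0, noise)
        points.append((x, y))
        labels.append(label)
    return points, labels
```

Calling `make_circles(200)` returns 200 two-dimensional points split evenly between the two classes, which is enough to train a small MLP classifier on.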
Oct-2-2025, 21:13:56 GMT