Meta Internal Learning: Supplementary Material
Raphael Bensadoun
Neural Information Processing Systems
Next, we would like to prove the opposite direction.

All LeakyReLU activations have a slope of 0.02 for negative values, except when we use a classic discriminator for single-image training, for which we use a slope of 0.2. Additionally, the generator's last conv-block activation at each scale is Tanh instead of ReLU, and the discriminator's last […]

We clip the gradients such that they have a maximal L2 norm of 1 for both the generators and the discriminators. Batch sizes of 16 were used for all experiments involving a dataset of images. At test time, GPU memory usage is significantly reduced, requiring 5 GB.

In this section, we consider training our method with a "frozen" pretrained ResNet34, i.e., optimizing […] If the problem could be learned with a "small enough" depth, our method would benefit from even […] As can be seen, our method yields realistic results with any batch size.
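To make the activation configuration above concrete, the following is a minimal PyTorch sketch, not the paper's actual code; the `conv_block` helper and its arguments are hypothetical.

```python
# Minimal sketch of the activation configuration described above.
# The conv_block helper and its argument names are hypothetical.
import torch.nn as nn

def conv_block(in_ch, out_ch, last_gen_block=False, classic_disc=False):
    # The classic discriminator in single-image training uses slope 0.2;
    # all other LeakyReLU activations use slope 0.02.
    slope = 0.2 if classic_disc else 0.02
    # The generator's last conv-block at each scale ends in Tanh.
    act = nn.Tanh() if last_gen_block else nn.LeakyReLU(negative_slope=slope)
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        act,
    )
```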
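The gradient clipping can be sketched as below, assuming standard PyTorch training loops; `model`, `optimizer`, and `loss` are placeholders, and the same call is applied to both the generators and the discriminators.

```python
# Sketch of one optimization step with the gradient clipping described
# above; applied identically to the generators and the discriminators.
import torch
from torch.nn.utils import clip_grad_norm_

def clipped_step(model: torch.nn.Module,
                 optimizer: torch.optim.Optimizer,
                 loss: torch.Tensor) -> None:
    optimizer.zero_grad()
    loss.backward()
    # Rescale gradients so their total L2 norm is at most 1.
    clip_grad_norm_(model.parameters(), max_norm=1.0)  # norm_type=2 by default
    optimizer.step()
```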
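The "frozen" pretrained ResNet34 setup can be sketched as follows; since the corresponding sentence above is truncated, which parts of the model remain trainable is an assumption, and the snippet only demonstrates keeping the pretrained backbone fixed.

```python
# Sketch of freezing a pretrained ResNet34 backbone (torchvision).
# How its features feed the rest of the method is not specified in the
# source, so only the freezing itself is shown here.
import torch
from torchvision.models import resnet34

backbone = resnet34(pretrained=True)  # pretrained ImageNet weights
backbone.eval()
for p in backbone.parameters():
    p.requires_grad_(False)  # no gradients flow into the frozen backbone

with torch.no_grad():
    feats = backbone(torch.randn(16, 3, 224, 224))  # batch size 16, as above
```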