A Experimental details
Neural Information Processing Systems
In general, transferring more layers appears to produce stronger positive effects. All ResNet18 runs on ImageNet, and most of those on CIFAR10, result in positive transfer (measured by downstream training accuracy), which may be because we do not re-scale them after pre-training. Other important factors, yet to be discovered, most likely also play a role in these experiments. However, the evidence presented earlier (Figures 2 and 3, Table 1) suggests that alignment is responsible for the observed positive transfer in at least some of these cases.

B.6 Detailed results for some of the experiments

Figure 16 reports experiments with ResNet18 on ImageNet where pre-training helps.
Feb-7-2025, 11:28:19 GMT