A Experimental Protocol

We selected hyperparameters using the four disjoint validation corruptions provided with CIFAR-10-C and ImageNet-C [ 12
–Neural Information Processing Systems
We considered the following hyperparameters when performing a grid search. Beyond learning rate and number of gradient steps, we also evaluated using a simple "threshold" by performing adaptation only when the marginal entropy was greater than ... ResNext-101 models without any additional tuning, except we use B = 32 due to memory limits. The TTA results are obtained using the same AugMix augmentations as for MEMO. We obtain the baseline ResNet-50 and ResNext-101 (32x8d) parameters directly from the torchvision library. One may wonder: are augmentations needed in the first place?
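The "threshold" variant mentioned above can be illustrated with a minimal sketch. It assumes the marginal entropy is the entropy of the model's predictive distribution averaged over the B augmented copies of one test input. The function names and the threshold value are hypothetical, since the excerpt gives neither, and plain Python lists stand in for model logits so the example runs without a deep learning library.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def marginal_entropy(logits_per_aug):
    """Entropy of the prediction averaged over augmented copies of one input.

    logits_per_aug: list of B rows, one row of class logits per augmentation.
    """
    probs = [softmax(row) for row in logits_per_aug]
    n_aug = len(probs)
    n_classes = len(probs[0])
    # Average the per-augmentation distributions into one marginal distribution.
    marginal = [sum(p[k] for p in probs) / n_aug for k in range(n_classes)]
    # Shannon entropy in nats; zero-probability classes contribute nothing.
    return -sum(p * math.log(p) for p in marginal if p > 0.0)


def should_adapt(logits_per_aug, threshold):
    """Hypothetical gate: adapt only when the marginal entropy exceeds threshold."""
    return marginal_entropy(logits_per_aug) > threshold


# Two augmentations that disagree confidently give a high marginal entropy,
# while two that agree confidently give a marginal entropy near zero.
disagree = [[10.0, -10.0], [-10.0, 10.0]]
agree = [[10.0, -10.0], [10.0, -10.0]]
print(should_adapt(disagree, threshold=0.5), should_adapt(agree, threshold=0.5))
```

Under this gate, inputs whose augmented predictions already agree confidently are skipped, so gradient steps are spent only on inputs where the augmentations disagree or the model is uncertain.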
Nov-17-2025, 21:52:54 GMT