A More experiments
Neural Information Processing Systems
A.1 More on setup

Settings and hyperparameters. We train MultiMix and Dense MultiMix with mixed examples only. We use a mini-batch size of b = 128 examples in all experiments. Following Manifold Mixup [51], for every mini-batch we apply MultiMix with probability 0.5 and input mixup otherwise. For multi-GPU experiments, all training hyperparameters, including m and n, are per GPU. For Dense MultiMix, the spatial resolution is r = 4 × 4 = 16 on CIFAR-10/100 and r = 7 × 7 = 49 on ImageNet by default.
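The per-mini-batch choice between MultiMix and input mixup described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `choose_mixing`, the string labels, and the dictionary `SPATIAL_RESOLUTION` are hypothetical, and the mixing operations themselves are not implemented. Only the batch size, the probability 0.5, and the spatial resolutions are taken from the text.

```python
import random

BATCH_SIZE = 128  # mini-batch size b stated in the text

# Dense MultiMix spatial resolution r = height * width, as stated in the text.
# The dictionary keys are illustrative labels, not names from the paper.
SPATIAL_RESOLUTION = {"cifar": 4 * 4, "imagenet": 7 * 7}


def choose_mixing(rng, p_multimix=0.5):
    """Pick the mixing method for one mini-batch.

    Returns "multimix" with probability p_multimix, else "input_mixup".
    (Hypothetical helper; the labels are placeholders for the real operations.)
    """
    return "multimix" if rng.random() < p_multimix else "input_mixup"


# Draw the method independently for each of 1000 mini-batches.
rng = random.Random(0)
schedule = [choose_mixing(rng) for _ in range(1000)]
```

With p_multimix = 0.5, roughly half of the mini-batches in `schedule` are assigned to each method. In a multi-GPU run, the text implies that values such as m and n would be set per GPU rather than globally.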
May-25-2025, 11:02:25 GMT