A Theoretical Analysis
Neural Information Processing Systems
Note 3: Consider a balanced continual learning dataset (e.g., Split-CIFAR100, Split-Mini-ImageNet).
Note 4: Consider general continual learning datasets.
The hyperparameter settings are summarized in Table 4. All models are optimized using vanilla SGD. For all experiments, we use a learning rate of 0.1, following the same setting as Aljundi et al. Mai et al. (2021) reported considerable and consistent performance gains when replacing the Softmax classifier with the NCM classifier.
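To make the NCM (nearest class mean) classifier mentioned above concrete, here is a minimal sketch in plain Python. It is not the implementation from Mai et al.: the function names `fit_class_means` and `ncm_predict`, the choice of Euclidean distance, and the toy feature values are all assumptions made for illustration. The idea is that, instead of a learned Softmax layer, each class is represented by the mean of its feature vectors, and a sample is assigned to the class with the nearest mean.

```python
from collections import defaultdict
import math


def fit_class_means(features, labels):
    """Return {label: mean feature vector} (hypothetical helper)."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        # Accumulate a per-class running sum of feature vectors.
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def ncm_predict(class_means, x):
    """Return the label whose class mean is closest to x.

    Euclidean distance is an assumption of this sketch.
    """
    return min(class_means, key=lambda y: math.dist(class_means[y], x))


# Toy 2-D vectors standing in for encoder features (illustrative only).
features = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
labels = [0, 0, 1, 1]

means = fit_class_means(features, labels)
prediction = ncm_predict(means, [1.0, 1.5])
```

In a continual learning setting, the class means would typically be computed from features of stored or current samples, so no classifier weights need to be retrained when new classes arrive; how the features are obtained is outside the scope of this sketch.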
Aug-15-2025, 05:14:18 GMT