A Theoretical Analysis

Neural Information Processing Systems 

Note 3: Consider a balanced continual learning dataset (e.g., Split-CIFAR100, Split-Mini-ImageNet) Note 4: Consider general continual learning datasets. The hyperparameter settings are summarized in Table 4. All models are optimized using vanilla SGD. For all experiments, we use the learning rate of 0.1 following the same setting as in Aljundi et al. Mai et al. reported (2021) considerable and consistent performance gains when replacing the Softmax classifier with the NCM classifier.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found