
 cifar-100



Class-Incremental Learning via Dual Augmentation

Neural Information Processing Systems

Typically, DNNs suffer from drastic performance degradation on previously learned tasks after learning new knowledge, a well-documented phenomenon known as catastrophic forgetting [8, 9, 10].


4ec0b6648bdf487a2f1c815924339022-Paper-Conference.pdf

Neural Information Processing Systems

In knowledge distillation, previous feature distillation methods mainly focus on the design of loss functions and the selection of the distilled layers, while the effect of the feature projector between the student and the teacher remains underexplored.





A Experimental setup

Neural Information Processing Systems

In this section, we detail the model architectures examined in the experiments and list all hyperparameters used. Both architectures consist of five stages, each consisting of a combination of convolutional layers with ReLU activation and max pooling layers. The base number of channels in consecutive stages for VGG architectures equals 64, 128, 256, 512, and 512. The subsequent stages are composed of residual blocks. In the case of ResNets, we report the results for the 'conv2' layers.
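The five-stage VGG layout described above can be sketched as a plain list of layer specifications. This is only an illustrative sketch: the channel widths (64, 128, 256, 512, 512) come from the text, while the number of convolutions per stage, the placement of one max pooling layer at the end of each stage, the 3-channel input, and the names `Layer`, `build_vgg_layout`, and `ASSUMED_CONVS_PER_STAGE` are assumptions introduced here, not details from the paper.

```python
from dataclasses import dataclass
from typing import List

# Channel widths per stage, as stated in the text for the VGG architectures.
VGG_STAGE_CHANNELS = [64, 128, 256, 512, 512]

# ASSUMPTION: a VGG-11-style number of convolutions per stage.
# The text does not specify these counts.
ASSUMED_CONVS_PER_STAGE = [1, 1, 2, 2, 2]


@dataclass
class Layer:
    kind: str          # "conv_relu" or "max_pool"
    in_channels: int
    out_channels: int


def build_vgg_layout(stage_channels: List[int],
                     convs_per_stage: List[int],
                     in_channels: int = 3) -> List[Layer]:
    """Return a flat list of layer specs for a staged VGG-like network.

    ASSUMPTION: each stage is a run of conv+ReLU layers followed by a
    single max pooling layer; the text only says stages combine the two.
    """
    if len(stage_channels) != len(convs_per_stage):
        raise ValueError("need one conv count per stage")
    layers: List[Layer] = []
    current = in_channels
    for width, n_convs in zip(stage_channels, convs_per_stage):
        for _ in range(n_convs):
            layers.append(Layer("conv_relu", current, width))
            current = width
        # Pooling changes spatial resolution, not the channel count.
        layers.append(Layer("max_pool", current, current))
    return layers


layout = build_vgg_layout(VGG_STAGE_CHANNELS, ASSUMED_CONVS_PER_STAGE)
for layer in layout:
    print(f"{layer.kind:10s} {layer.in_channels:4d} -> {layer.out_channels:4d}")
```

Changing `ASSUMED_CONVS_PER_STAGE` yields deeper or shallower variants without touching the stage widths given in the text.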