How a student becomes a teacher: learning and forgetting through Spectral methods

Neural Information Processing Systems 

The above scheme proves particularly relevant when the student network is overparameterized (namely, when it employs larger layer sizes) compared to the underlying teacher network. Under these operating conditions, it is tempting to speculate that the student's ability to handle the given task could eventually be stored in a sub-portion of the whole network.
