Shadow Knowledge Distillation: Bridging Offline and Online Knowledge Transfer

Lujun Li

Neural Information Processing Systems 

Knowledge distillation (KD) aims to transfer knowledge from a high-capacity large model (i.e., the teacher) to a low-capacity
