Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation Jiaming Lv, Haoyuan Yang

Neural Information Processing Systems 

Since the pioneering work of Hinton et al., knowledge distillation based on the Kullback-Leibler divergence (KL-Div) has been predominant, and recently its variants have
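For context, the KL-Div-based distillation objective referenced above (Hinton et al.'s formulation) compares temperature-softened teacher and student distributions. Below is a minimal NumPy sketch of that loss; the function names `softmax` and `kd_kl_loss` and the temperature value are illustrative choices, not from this paper.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a 1-D logit vector."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()           # numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_kl_loss(teacher_logits, student_logits, T=4.0):
    """Hinton-style distillation loss: KL(p_teacher || p_student)
    between temperature-softened distributions, scaled by T^2 so
    gradients keep a comparable magnitude across temperatures."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))
```

When the student's logits match the teacher's exactly, the loss is zero; any mismatch yields a strictly positive value, which is the quantity the distillation objective minimizes.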
