Improving Knowledge Distillation in Transfer Learning with Layer-wise Learning Rates
