Variational Student: Learning Compact and Sparser Networks in Knowledge Distillation Framework
