Improved Feature Distillation via Projector Ensemble

Neural Information Processing Systems

In knowledge distillation, previous feature distillation methods mainly focus on the design of loss functions and the selection of the distilled layers, while the effect of the feature projector between the student and the teacher remains under-explored.
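
As a rough illustration of the component in question, the sketch below distills features through a small learnable projector that maps student features into the teacher's feature space. The single linear layer, the feature dimensions, and the MSE objective are generic assumptions for illustration, not the paper's specific design.

```python
import torch
import torch.nn as nn

class FeatureProjector(nn.Module):
    """Maps student features into the teacher's feature space."""
    def __init__(self, student_dim: int, teacher_dim: int):
        super().__init__()
        # A single linear projector; real designs may be deeper or ensembled.
        self.proj = nn.Linear(student_dim, teacher_dim)

    def forward(self, student_feat: torch.Tensor) -> torch.Tensor:
        return self.proj(student_feat)

def feature_distillation_loss(student_feat, teacher_feat, projector):
    # Align projected student features with (detached) teacher features.
    return nn.functional.mse_loss(projector(student_feat), teacher_feat.detach())

# Usage with toy features: batch of 8, student dim 256, teacher dim 512.
projector = FeatureProjector(student_dim=256, teacher_dim=512)
s = torch.randn(8, 256)   # student features
t = torch.randn(8, 512)   # teacher features for the same batch
loss = feature_distillation_loss(s, t, projector)
```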



Sequential Subset Matching for Dataset Distillation

Neural Information Processing Systems

The synthetic datasets are expected to capture the essence of the knowledge contained in real-world datasets, such that the former yield performance similar to the latter.
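
For intuition, here is a minimal gradient-matching sketch of dataset distillation, in which synthetic examples are optimized so that a model's loss gradient on them resembles its gradient on real data. The toy linear model, the sizes, and the single-model setup are illustrative assumptions, not the paper's sequential subset-matching method.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 2)                        # toy classifier, held fixed
params = list(model.parameters())
real_x, real_y = torch.randn(64, 10), torch.randint(0, 2, (64,))

syn_x = torch.randn(4, 10, requires_grad=True)  # learnable synthetic inputs
syn_y = torch.tensor([0, 1, 0, 1])              # fixed synthetic labels
opt = torch.optim.Adam([syn_x], lr=0.1)
ce = nn.CrossEntropyLoss()

for step in range(100):
    g_real = torch.autograd.grad(ce(model(real_x), real_y), params)
    g_syn = torch.autograd.grad(ce(model(syn_x), syn_y), params,
                                create_graph=True)
    # Push synthetic-data gradients toward real-data gradients, so that
    # training on syn_x later mimics training on the real dataset.
    loss = sum(((gr.detach() - gs) ** 2).sum()
               for gr, gs in zip(g_real, g_syn))
    opt.zero_grad()
    loss.backward()
    opt.step()
```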



Improving the Training of Rectified Flows

Neural Information Processing Systems

One approach for tackling this problem is rectified flows, which iteratively learn smooth ODE paths that are less susceptible to truncation error. However, rectified flows still require a relatively large number of function evaluations (NFEs). In this work, we propose improved techniques for training rectified flows, allowing them to compete with knowledge distillation methods even in the low NFE setting.
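
A minimal sketch of the basic rectified-flow objective the abstract builds on: a network regresses the constant velocity along straight interpolation paths between noise and data, and sampling integrates the learned ODE with a small number of Euler steps (NFEs). The 2-D toy data, the MLP, and the step counts are illustrative assumptions; the paper's improved training techniques are not reproduced here.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Velocity network: input is (x, t) with x in R^2, output is dx/dt in R^2.
velocity = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(velocity.parameters(), lr=1e-3)

for step in range(1000):
    x1 = torch.randn(128, 2) + 3.0          # toy "data" samples
    x0 = torch.randn(128, 2)                # Gaussian noise samples
    t = torch.rand(128, 1)
    xt = (1 - t) * x0 + t * x1              # straight-line interpolation
    target = x1 - x0                        # constant velocity along the path
    pred = velocity(torch.cat([xt, t], dim=1))
    loss = ((pred - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Low-NFE sampling: integrate dx/dt = v(x, t) with 4 coarse Euler steps.
x = torch.randn(5, 2)
for i in range(4):
    t = torch.full((5, 1), i / 4)
    x = x + velocity(torch.cat([x, t], dim=1)) / 4
```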




Teach Less, Learn More: On the Undistillable Classes in Knowledge Distillation

Neural Information Processing Systems

A counter-intuitive observation is that a larger, more expressive teacher does not necessarily make a better student, but the reasons for this phenomenon remain unclear.