Knowledge Distillation by On-the-Fly Native Ensemble
Xu Lan, Xiatian Zhu, Shaogang Gong
Neural Information Processing Systems
Knowledge distillation is effective for training small, generalisable network models that meet low-memory and fast-inference requirements. Existing offline distillation methods rely on a strong pre-trained teacher, which enables favourable knowledge discovery and transfer but requires a complex two-phase training procedure. Online counterparts address this limitation, at the price of lacking a high-capacity teacher. In this work, we present an On-the-fly Native Ensemble (ONE) learning strategy for one-stage online distillation.
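The core idea can be sketched as follows: during training, an ensemble "teacher" is formed on the fly from the network's own branches, and each branch is distilled toward it. This is a minimal, hedged illustration, not the paper's implementation; the gate weights, temperature, and function names here are illustrative assumptions.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax, computed stably.
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def on_the_fly_distillation_loss(branch_logits, gate_weights, T=3.0):
    """Hypothetical sketch of an ONE-style objective:
    build the native ensemble teacher as a gate-weighted sum of branch
    logits, then distil each branch toward it with a temperature-scaled
    KL divergence. branch_logits: (branches, samples, classes)."""
    branch_logits = np.asarray(branch_logits, dtype=float)
    w = np.asarray(gate_weights, dtype=float)
    # Ensemble teacher logits: weighted combination over branches.
    teacher_logits = np.tensordot(w, branch_logits, axes=(0, 0))
    p_teacher = softmax(teacher_logits, T)
    losses = []
    for b in range(branch_logits.shape[0]):
        p_branch = softmax(branch_logits[b], T)
        # KL(teacher || branch), averaged over samples.
        kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_branch)),
                    axis=-1)
        losses.append(kl.mean())
    return teacher_logits, float(np.mean(losses))
```

Because the teacher is assembled from the branches themselves in a single pass, no separate pre-trained teacher or second training phase is needed; when all branches agree, the distillation term vanishes.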