Weighted Mutual Learning with Diversity-Driven Model Compression

Oct-10-2024, 23:13:02 GMT–Neural Information Processing Systems

Online distillation has attracted attention because it simplifies the traditional two-stage knowledge distillation process into a single stage. It collaboratively trains a group of peer models, all treated as students, so that each student gains extra knowledge from the others. However, memory consumption and diversity among peers are two key challenges to the scalability and quality of online distillation. To address both, this paper presents a framework for online distillation called Weighted Mutual Learning with Diversity-Driven Model Compression (WML). First, building on a hierarchical structure in which peers share different parts, we leverage structured network pruning to generate diversified peer models and reduce memory requirements.
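To make the idea of peers learning from each other concrete, here is a minimal sketch of a generic mutual-learning loss for one peer: a supervised cross-entropy term plus a weighted sum of KL divergences from the other peers' predictions. It is an illustration only, not the WML objective. The function names, the fixed `weights` list, and the choice of KL direction are assumptions; how WML sets the peer weights is not described in this excerpt.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mutual_learning_loss(peer_logits, label, weights, i):
    """Loss for peer i: supervised term plus weighted KL to every other peer.

    `weights` is a hypothetical fixed list with one entry per peer;
    the entry for peer i itself is ignored.
    """
    probs = [softmax(z) for z in peer_logits]
    loss = cross_entropy(probs[i], label)
    for j, p_j in enumerate(probs):
        if j != i:
            # Pull peer i's prediction toward peer j's prediction.
            loss += weights[j] * kl_divergence(p_j, probs[i])
    return loss

# Two peers that disagree on a 3-class example whose true class is 0.
example_logits = [[2.0, 0.5, 0.1], [0.1, 0.5, 2.0]]
print(mutual_learning_loss(example_logits, label=0, weights=[0.5, 0.5], i=0))
```

When all peers produce identical predictions, the KL terms vanish and the loss reduces to plain cross-entropy, which is one way to see why diversity among peers matters: identical peers have nothing to teach each other.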

distillation, diversity-driven model compression, weighted mutual learning

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.40)