Exploring Feature-based Knowledge Distillation for Recommender System: A Frequency Perspective

Jan-13-2025, arXiv.org Artificial Intelligence

By defining knowledge as different frequency components of the features, we theoretically demonstrate that regular feature-based knowledge distillation is equivalent to equally minimizing losses on all knowledge and further analyze how this equal loss weight allocation method leads to important knowledge being overlooked. In light of this, we propose to emphasize important knowledge by redistributing knowledge weights. Furthermore, we propose FreqD, a lightweight knowledge reweighting method, to avoid the computational cost ...

To improve the inference efficiency without sacrificing accuracy, many studies [10, 11, 13, 31] have adopted Knowledge Distillation (KD) for recommender systems. KD is a model-agnostic approach for model compression [6, 8]. In knowledge distillation for recommendation, the common process is first to train a large teacher model using the user-item interactions, then train a small student model using the user-item interactions as well as the features in the intermediate layer [10, 11, 13] and the predictions in the output layer [1, 10, 15, 17] provided by the teacher model.
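The claim that regular feature-based distillation weights all frequency components equally can be illustrated with a small numerical sketch. This is not the paper's FreqD method: it assumes, purely for illustration, that "frequency components" are projections of the item feature matrix onto an orthonormal DCT-II basis along the item axis, that student features are already projected to the teacher's dimension, and that the loss is a squared Frobenius distance. The function names and the example weights are hypothetical.

```python
import math
import random


def dct_basis(n):
    """Orthonormal DCT-II basis; row k is the k-th frequency vector (assumed basis)."""
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append([scale * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)])
    return basis


def feature_kd_loss(student, teacher):
    """Plain feature-based KD loss: squared Frobenius distance between feature matrices."""
    return sum(
        (s - t) ** 2
        for s_row, t_row in zip(student, teacher)
        for s, t in zip(s_row, t_row)
    )


def per_frequency_losses(student, teacher, basis):
    """Squared error of the feature difference projected onto each frequency vector."""
    n, d = len(student), len(student[0])
    losses = []
    for u in basis:
        proj = [
            sum(u[i] * (student[i][j] - teacher[i][j]) for i in range(n))
            for j in range(d)
        ]
        losses.append(sum(p * p for p in proj))
    return losses


def reweighted_kd_loss(student, teacher, basis, weights):
    """Weighted sum of per-frequency losses; all-ones weights give the plain loss."""
    per_freq = per_frequency_losses(student, teacher, basis)
    return sum(w * loss for w, loss in zip(weights, per_freq))


random.seed(0)
n_items, dim = 8, 4  # toy sizes, chosen only for the example
teacher = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_items)]
student = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_items)]
basis = dct_basis(n_items)

plain = feature_kd_loss(student, teacher)
equal = reweighted_kd_loss(student, teacher, basis, [1.0] * n_items)

# Hypothetical weights that emphasize low-frequency components.
low_freq_weights = [1.0 / (1 + k) for k in range(n_items)]
emphasized = reweighted_kd_loss(student, teacher, basis, low_freq_weights)

print(abs(plain - equal) < 1e-9)  # equal weights recover the plain loss
```

Because the basis is orthonormal, the plain loss splits exactly into per-frequency terms with weight one each; changing the weights is one simple way to emphasize some components over others. Which components matter, and how the weights should be set, is the subject of the paper and is not settled by this sketch.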

