
Collaborating Authors

 Lee, Jae-woong


Toward a Better Understanding of Loss Functions for Collaborative Filtering

arXiv.org Artificial Intelligence

Collaborative filtering (CF) is a pivotal technique in modern recommender systems. The learning process of CF models typically consists of three components: an interaction encoder, a loss function, and negative sampling. Although many existing studies have proposed various CF models with sophisticated interaction encoders, recent work shows that simply reformulating the loss function can achieve significant performance gains. This paper analyzes the relationships among existing loss functions. Our mathematical analysis reveals that previous loss functions can be interpreted as alignment and uniformity functions: (i) alignment matches user and item representations, and (ii) uniformity disperses user and item distributions. Inspired by this analysis, we propose Margin-aware Alignment and Weighted Uniformity (MAWU), a novel loss function that improves the design of alignment and uniformity by considering the unique patterns of each dataset. The key novelty of MAWU is two-fold: (i) margin-aware alignment (MA) mitigates user/item-specific popularity biases, and (ii) weighted uniformity (WU) adjusts the relative importance of user and item uniformities to reflect the inherent characteristics of datasets. Extensive experimental results show that MF and LightGCN equipped with MAWU are comparable or superior to state-of-the-art CF models with various loss functions on three public datasets.
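
For intuition, here is a minimal PyTorch-style sketch of alignment/uniformity losses with a margin on the positive pairs and separate weights for the user- and item-side uniformity terms. The function and parameter names (margin, gamma_user, gamma_item) and the exact way the margin enters are illustrative assumptions, not MAWU's precise formulation.

```python
import torch
import torch.nn.functional as F

def alignment(user_emb, item_emb, margin=0.1):
    """Pull matched user/item pairs together on the unit sphere.
    The margin is an illustrative slack on the positive-pair distance,
    loosely mimicking a popularity-aware margin (assumption)."""
    u = F.normalize(user_emb, dim=-1)
    i = F.normalize(item_emb, dim=-1)
    return torch.relu((u - i).norm(p=2, dim=1).pow(2) - margin).mean()

def uniformity(x, t=2.0):
    """Disperse embeddings of one side: log of the mean Gaussian potential
    over all pairwise distances (the standard uniformity loss)."""
    x = F.normalize(x, dim=-1)
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()

def mawu_style_loss(user_emb, item_emb, gamma_user=0.5, gamma_item=0.5):
    """Alignment plus a weighted sum of user- and item-side uniformities,
    where the two weights play the role of WU's dataset-dependent balance."""
    return (alignment(user_emb, item_emb)
            + gamma_user * uniformity(user_emb)
            + gamma_item * uniformity(item_emb))
```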


uCTRL: Unbiased Contrastive Representation Learning via Alignment and Uniformity for Collaborative Filtering

arXiv.org Artificial Intelligence

Because implicit user feedback for collaborative filtering (CF) models is biased toward popular items, CF models tend to yield recommendation lists with popularity bias. Previous studies have utilized inverse propensity weighting (IPW) or causal inference to mitigate this problem. However, they solely employ pointwise or pairwise loss functions and neglect to adopt a contrastive loss function for learning meaningful user and item representations. In this paper, we propose Unbiased ConTrastive Representation Learning (uCTRL), which optimizes alignment and uniformity functions derived from the InfoNCE loss for CF models. Specifically, we formulate an unbiased alignment function used in uCTRL. We also devise a novel IPW estimation method that removes the bias of both users and items. Despite its simplicity, uCTRL equipped with existing CF models consistently outperforms state-of-the-art unbiased recommender models on four benchmark datasets, with gains of up to 12.22% in Recall@20 and 16.33% in NDCG@20.
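
As a rough illustration of the idea, the sketch below weights each positive pair's alignment term by an inverse propensity score; the per-pair propensity input and the simple normalization are assumptions for illustration, not uCTRL's exact estimator.

```python
import torch
import torch.nn.functional as F

def unbiased_alignment(user_emb, item_emb, propensity, eps=1e-6):
    """IPW-style alignment: pairs that are observed mainly because the user
    or item is popular receive high propensity and thus low weight.
    `propensity` is a per-pair observation-probability estimate (illustrative)."""
    u = F.normalize(user_emb, dim=-1)
    i = F.normalize(item_emb, dim=-1)
    sq_dist = (u - i).norm(p=2, dim=1).pow(2)
    weights = 1.0 / propensity.clamp(min=eps)  # inverse propensity weights
    return (weights * sq_dist).sum() / weights.sum()
```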


Collaborative Distillation for Top-N Recommendation

arXiv.org Machine Learning

Knowledge distillation (KD) is a well-known method to reduce inference latency by compressing a cumbersome teacher model into a small student model. Despite the success of KD in classification tasks, applying KD to recommender models is challenging due to the sparsity of positive feedback, the ambiguity of missing feedback, and the ranking nature of top-N recommendation. To address these issues, we propose a new KD model for the collaborative filtering approach, namely Collaborative Distillation (CD). Specifically, we reformulate the loss function to deal with the ambiguity of missing feedback. Experimental results demonstrate that the proposed model outperforms the state-of-the-art method by 2.7-33.2%. Moreover, the proposed model achieves performance comparable to the teacher model.

Neural recommender models [1]-[9] have achieved better performance than conventional latent factor models, either by capturing nonlinear and complex correlation patterns among users/items or by leveraging hidden features extracted from auxiliary information such as texts and images. However, the number of model parameters of neural models is greater than that of conventional models by one or more orders of magnitude, indicating a trade-off between accuracy and efficiency. As a result, neural recommender models usually suffer from higher latency during the inference phase. Our primary goal is to develop a recommender model that achieves a balance between effectiveness and efficiency. In this paper, we employ knowledge distillation (KD) [10], a network compression technique that transfers the distilled knowledge of a large model (a.k.a. the teacher model) to a small model (a.k.a. the student model). Because the student model can utilize the knowledge transferred from the teacher model, it naturally exhibits the properties of computational efficiency and low memory usage, and is therefore capable of balancing effectiveness and efficiency. Specifically, the training procedure for KD consists of two steps. In the offline training phase, the teacher model is supervised by a training dataset with labels; the student model is then trained with both the ground-truth labels and the soft predictions transferred from the teacher model.
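
As a generic sketch of this two-step recipe (not CD's exact objective), the snippet below trains a student on observed labels while also matching the teacher's tempered predictions on missing feedback; lambda_kd and temperature are illustrative hyperparameters.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, labels, teacher_logits,
                      lambda_kd=0.5, temperature=2.0):
    """Hard-label BCE on observed feedback plus a soft-label term that
    pulls the student toward the teacher's (tempered) predictions.
    `labels` are 0/1 floats; all score tensors share the same shape."""
    hard = F.binary_cross_entropy_with_logits(student_logits, labels)
    soft_targets = torch.sigmoid(teacher_logits / temperature)
    soft = F.binary_cross_entropy_with_logits(student_logits / temperature,
                                              soft_targets)
    return hard + lambda_kd * soft
```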