Personal Assistant Systems
Fine-Tuning Out-of-Vocabulary Item Recommendation with User Sequence Imagination, Hao Chen
Recommending out-of-vocabulary (OOV) items is a challenging problem since the in-vocabulary (IV) items have well-trained behavioral embeddings but the OOV items only have content features. Current OOV recommendation models often generate 'makeshift' embeddings for OOV items from content features and then recommend jointly with the 'makeshift' OOV item embeddings and the behavioral IV item embeddings. However, merely using the 'makeshift' embeddings results in suboptimal recommendation performance due to the substantial gap between the content features and the behavioral embeddings. To bridge the gap, we propose a novel User Sequence IMagination (USIM) fine-tuning framework, which first imagines the user sequences and then refines the generated OOV embeddings with the user behavioral embeddings. Specifically, we frame user sequence imagination as a reinforcement learning problem and develop a recommendation-focused reward function to evaluate to what extent a user can help recommend the OOV items. Besides, we propose an embedding-driven transition function to model the embedding transition after imagining a user. USIM has been deployed on a prominent e-commerce platform for months, offering recommendations for millions of OOV items and billions of users. Extensive experiments demonstrate that USIM outperforms traditional generative models in OOV item recommendation performance across both traditional and GNN-based collaborative filtering models.
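As a rough illustration of the imagination-then-refinement idea, the sketch below takes a content-derived ("makeshift") OOV embedding, greedily "imagines" the user whose behavioral embedding gives the highest dot-product proxy reward, and pulls the item embedding toward it. The greedy loop, the reward, and all names are assumptions standing in for the paper's RL policy and learned transition function, not the USIM implementation.

```python
# Hedged sketch of imagination-then-refinement for an OOV item embedding.
# The greedy user selection and dot-product reward are illustrative proxies
# for USIM's RL policy and embedding-driven transition; nothing here is the
# paper's actual code.
import numpy as np

rng = np.random.default_rng(0)
num_users, dim = 1000, 64
user_emb = rng.normal(size=(num_users, dim))   # pretrained behavioral user embeddings
oov_emb = rng.normal(size=dim)                 # 'makeshift' embedding from content features

def imagine_and_refine(item_emb, steps=10, lr=0.1):
    emb = item_emb.copy()
    for _ in range(steps):
        # Recommendation-focused proxy reward: dot-product score of each user.
        rewards = user_emb @ emb
        best_user = int(np.argmax(rewards))     # greedily "imagine" the best user
        # Embedding-driven transition: move toward that user's behavioral embedding.
        emb = (1 - lr) * emb + lr * user_emb[best_user]
    return emb

refined = imagine_and_refine(oov_emb)
print("best user score before:", float((user_emb @ oov_emb).max()))
print("best user score after :", float((user_emb @ refined).max()))
```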
Collaborative filtering (CF) has exhibited prominent results for recommender systems and has been broadly utilized in real-world applications. A branch of research enhances CF methods with the message passing (MP) used in graph neural networks, owing to its strong capability of extracting knowledge from graph-structured data, such as the user-item bipartite graphs that naturally arise in CF. These works assume that MP helps CF methods in a manner akin to its benefits for graph-based learning tasks in general (e.g., node classification). However, even though MP empirically improves CF, whether this assumption is correct still needs verification. To address this gap, we formally investigate why MP helps CF from multiple perspectives and show that many assumptions made by previous works are not entirely accurate.
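For readers unfamiliar with MP in CF, here is a minimal, generic sketch of one LightGCN-style propagation step on a user-item bipartite graph; it only illustrates the kind of message passing the abstract refers to, not the paper's analysis or any specific model it studies.

```python
# Minimal sketch of one message-passing propagation step (LightGCN-style)
# over a user-item bipartite graph. Generic illustration only.
import numpy as np

rng = np.random.default_rng(0)
num_users, num_items, dim = 4, 5, 8
R = (rng.random((num_users, num_items)) > 0.5).astype(float)  # user-item interaction matrix

U = rng.normal(size=(num_users, dim))   # user embeddings
V = rng.normal(size=(num_items, dim))   # item embeddings

# Symmetric degree normalization 1 / sqrt(d_u * d_i).
d_u = R.sum(axis=1, keepdims=True).clip(min=1.0)
d_i = R.sum(axis=0, keepdims=True).clip(min=1.0)
A = R / np.sqrt(d_u) / np.sqrt(d_i)

# One propagation step: users aggregate their neighboring items and vice versa.
U_next = A @ V
V_next = A.T @ U
```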
Control Variates for Slate Off-Policy Evaluation, Fernando Amat Gil (WarnerMedia)
We study the problem of off-policy evaluation from batched contextual bandit data with multidimensional actions, often termed slates. The problem is common to recommender systems and user-interface optimization, and it is particularly challenging because of the combinatorially-sized action space. Swaminathan et al. (2017) proposed the pseudoinverse (PI) estimator under the assumption that the conditional mean rewards are additive in actions. Using control variates, we consider a large class of unbiased estimators that includes as special cases the PI estimator and (asymptotically) its self-normalized variant. By optimizing over this class, we obtain new estimators with risk improvement guarantees over both the PI and the self-normalized PI estimators.
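The control-variate construction behind such a class of estimators can be illustrated independently of slates: importance weights have expectation 1 under correct propensities, so (w - 1) is a zero-mean control variate whose optimally scaled multiple can be subtracted from the plain importance-weighted estimate. The sketch below shows that generic recipe on made-up data; it is not the paper's slate-specific pseudoinverse estimator, and estimating the coefficient from the same sample is only asymptotically unbiased.

```python
# Hedged sketch of the control-variates idea for off-policy evaluation.
# Generic IPS-with-control-variate, not the slate-specific PI estimator.
import numpy as np

def cv_estimate(weights, rewards):
    """Value estimate with an optimally-scaled (w - 1) control variate."""
    w = np.asarray(weights, dtype=float)
    r = np.asarray(rewards, dtype=float)
    base = w * r                              # vanilla importance-weighted terms
    cv = w - 1.0                              # zero-mean control variate
    # Variance-minimizing coefficient: Cov(base, cv) / Var(cv).
    c = np.cov(base, cv)[0, 1] / max(np.var(cv, ddof=1), 1e-12)
    return base.mean() - c * cv.mean()

# Tiny usage example with synthetic importance weights and rewards.
rng = np.random.default_rng(0)
w = rng.exponential(1.0, size=10_000)         # stand-in importance weights (mean ~ 1)
r = rng.binomial(1, 0.3, size=10_000).astype(float)
print(cv_estimate(w, r), (w * r).mean())      # CV-adjusted estimate vs plain IPS
```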
Empowering Collaborative Filtering with Principled Adversarial Contrastive Loss
Contrastive Learning (CL) has achieved impressive performance in self-supervised learning tasks, showing superior generalization ability. Inspired by this success, adopting CL in collaborative filtering (CF) has become prevalent for semi-supervised top-K recommendation. The basic idea is to routinely conduct heuristic-based data augmentation and apply contrastive losses (e.g., InfoNCE) on the augmented views. Yet, several CF-tailored challenges make this adoption suboptimal, such as out-of-distribution issues, the risk of false negatives, and the nature of top-K evaluation. They necessitate that CL-based CF schemes focus more on mining hard negatives and distinguishing false negatives from the vast unlabeled user-item interactions to obtain informative contrast signals. Worse still, there is limited understanding of contrastive loss in CF methods, especially w.r.t.
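For context, the sketch below shows the vanilla InfoNCE loss on user-item pairs that such CL-based CF schemes start from; the paper's adversarial, principled variant is not reproduced here, and the shapes, temperature, and sampling are illustrative assumptions.

```python
# Hedged sketch of a standard InfoNCE loss over user-item pairs in CF.
# This is the baseline loss the abstract critiques, not the paper's method.
import torch
import torch.nn.functional as F

def infonce_loss(user_emb, pos_item_emb, neg_item_emb, temperature=0.2):
    """
    user_emb:      (B, d) user embeddings
    pos_item_emb:  (B, d) embeddings of interacted (positive) items
    neg_item_emb:  (B, N, d) embeddings of sampled negative items
    """
    u = F.normalize(user_emb, dim=-1)
    p = F.normalize(pos_item_emb, dim=-1)
    n = F.normalize(neg_item_emb, dim=-1)

    pos_logit = (u * p).sum(-1, keepdim=True) / temperature          # (B, 1)
    neg_logits = torch.einsum("bd,bnd->bn", u, n) / temperature      # (B, N)
    logits = torch.cat([pos_logit, neg_logits], dim=1)               # (B, 1+N)
    labels = torch.zeros(logits.size(0), dtype=torch.long)           # positive sits at index 0
    return F.cross_entropy(logits, labels)

loss = infonce_loss(torch.randn(32, 64), torch.randn(32, 64), torch.randn(32, 10, 64))
```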
End-to-end Learnable Clustering for Intent Learning in Recommendation
Intent learning, which aims to learn users' intents for user understanding and item recommendation, has become an active research topic in recent years. However, existing methods suffer from complex and cumbersome alternating optimization, limiting performance and scalability. To this end, we propose a novel intent learning method termed ELCRec, which unifies behavior representation learning into an End-to-end Learnable Clustering framework for effective and efficient Recommendation.
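A minimal sketch of the end-to-end learnable clustering idea follows, under the assumption that intent centers are trainable parameters and each behavior representation is pulled toward its nearest center within the ordinary training loop, avoiding alternating optimization; this is a generic illustration, not the ELCRec architecture.

```python
# Hedged sketch: learnable cluster (intent) centers trained jointly with the
# model, instead of alternating with an external k-means step. Illustrative
# only; not ELCRec itself.
import torch
import torch.nn as nn

class LearnableClusterLoss(nn.Module):
    def __init__(self, num_intents, dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_intents, dim))  # intent prototypes

    def forward(self, reps):                      # reps: (B, d) behavior representations
        dists = torch.cdist(reps, self.centers)   # (B, K) distances to intent centers
        return dists.min(dim=1).values.mean()     # pull each rep toward its nearest center

cluster_loss = LearnableClusterLoss(num_intents=8, dim=64)
reps = torch.randn(128, 64, requires_grad=True)
loss = cluster_loss(reps)            # would be added to the recommendation loss
loss.backward()                      # gradients flow to both reps and centers
```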
Simplify and Robustify Negative Sampling for Implicit Collaborative Filtering
A.1 General Machine Learning Approaches
Learning an implicit CF model from positive-only data is also related to Positive-Unlabeled (PU) learning and learning from noisy labels, as the remaining unobserved instances are unlabeled and noisy. Motivated by these general machine learning approaches, this paper formulates the negative sampling problem as efficient learning from unlabeled data in the presence of noisy labels, and pays more attention to the true negative instances hidden inside the massive unlabeled data. The following table and literature review discuss the differences between the approaches that can be adapted for this problem. Since implicit feedback data contain positive instances only, the implicit CF problem is also related to learning from positive-unlabeled (PU) data. PU learning formulates the problem as binary classification, accounting for the fact that both positive and negative labels exist in the unlabeled data [13, 15, 23].
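To make the setup concrete, the sketch below shows the plain uniform negative sampling baseline for implicit feedback that this line of work seeks to simplify and robustify: negatives are drawn from the unlabeled, unobserved items and may therefore be false negatives. The data and function names are illustrative.

```python
# Hedged sketch of uniform negative sampling for implicit CF: unobserved
# items are treated as (possibly noisy) negatives. This is the simple
# baseline strategy, not the paper's robustified sampler.
import numpy as np

rng = np.random.default_rng(0)
num_items = 1000
observed = {0: {3, 17, 256}, 1: {42, 7}}      # user -> set of interacted item ids

def sample_negative(user, num_neg=4):
    """Uniformly sample unobserved items as (possibly false) negatives."""
    negs = []
    while len(negs) < num_neg:
        item = int(rng.integers(num_items))
        if item not in observed.get(user, set()):   # skip known positives
            negs.append(item)
    return negs

print(sample_negative(0))   # e.g., items treated as negatives for user 0
```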