Goto

Collaborating Authors

 Personal Assistant Systems


Inductive Transfer Learning for Graph-Based Recommenders

arXiv.org Artificial Intelligence

Graph-based recommender systems are commonly trained in transductive settings, which limits their applicability to new users, items, or datasets. We propose NBF-Rec, a graph-based recommendation model that supports inductive transfer learning across datasets with disjoint user and item sets. Unlike conventional embedding-based methods that require retraining for each domain, NBF-Rec computes node embeddings dynamically at inference time. We evaluate the method on seven real-world datasets spanning movies, music, e-commerce, and location check-ins. NBF-Rec achieves competitive performance in zero-shot settings, where no target domain data is used for training, and demonstrates further improvements through lightweight fine-tuning. These results show that inductive transfer is feasible in graph-based recommendation and that interaction-level message passing supports generalization across datasets without requiring aligned users or items.


SteerX: Disentangled Steering for LLM Personalization

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown remarkable success in recent years, enabling a wide range of applications, including intelligent assistants that support users' daily life and work. A critical factor in building such assistants is personalizing LLMs, as user preferences and needs vary widely. Activation steering, which directly leverages directions representing user preference in the LLM activation space to adjust its behavior, offers a cost-effective way to align the model's outputs with individual users. However, existing methods rely on all historical data to compute the steering vector, ignoring that not all content reflects true user preferences, which undermines the personalization signal. To address this, we propose SteerX, a disentangled steering method that isolates preference-driven components from preference-agnostic components. Grounded in causal inference theory, SteerX estimates token-level causal effects to identify preference-driven tokens, transforms these discrete signals into a coherent description, and then leverages them to steer personalized LLM generation. By focusing on the truly preference-driven information, SteerX produces more accurate activation steering vectors and enhances personalization. Experiments on two representative steering backbone methods across real-world datasets demonstrate that SteerX consistently enhances steering vector quality, offering a practical solution for more effective LLM personalization.


Massive Memorization with Hundreds of Trillions of Parameters for Sequential Transducer Generative Recommenders

arXiv.org Artificial Intelligence

Modern large-scale recommendation systems rely heavily on user interaction history sequences to enhance the model performance. The advent of large language models and sequential modeling techniques, particularly transformer-like architectures, has led to significant advancements recently (e.g., HSTU, SIM, and TWIN models). While scaling to ultra-long user histories (10k to 100k items) generally improves model performance, it also creates significant challenges on latency, queries per second (QPS) and GPU cost in industry-scale recommendation systems. Existing models do not adequately address these industrial scalability issues. In this paper, we propose a novel two-stage modeling framework, namely VIrtual Sequential Target Attention (VISTA), which decomposes traditional target attention from a candidate item to user history items into two distinct stages: (1) user history summarization into a few hundred tokens; followed by (2) candidate item attention to those tokens. These summarization token embeddings are then cached in storage system and then utilized as sequence features for downstream model training and inference. This novel design for scalability enables VISTA to scale to lifelong user histories (up to one million items) while keeping downstream training and inference costs fixed, which is essential in industry. Our approach achieves significant improvements in offline and online metrics and has been successfully deployed on an industry leading recommendation platform serving billions of users.


Unifying Inductive, Cross-Domain, and Multimodal Learning for Robust and Generalizable Recommendation

arXiv.org Artificial Intelligence

Recommender systems have long been built upon the modeling of interactions between users and items, while recent studies have sought to broaden this paradigm by generalizing to new users and items, incorporating diverse information sources, and transferring knowledge across domains. Nevertheless, these efforts have largely focused on individual aspects, hindering their ability to tackle the complex recommendation scenarios that arise in daily consumptions across diverse domains. In this paper, we present MICRec, a unified framework that fuses inductive modeling, multimodal guidance, and cross-domain transfer to capture user contexts and latent preferences in heterogeneous and incomplete real-world data. Moving beyond the inductive backbone of INMO, our model refines expressive representations through modality-based aggregation and alleviates data sparsity by leveraging overlapping users as anchors across domains, thereby enabling robust and generalizable recommendation. Experiments show that MICRec outperforms 12 baselines, with notable gains in domains with limited training data.


DiffGRM: Diffusion-based Generative Recommendation Model

arXiv.org Artificial Intelligence

Generative recommendation (GR) is an emerging paradigm that represents each item via a tokenizer as an n-digit semantic ID (SID) and predicts the next item by autoregressively generating its SID conditioned on the user's history. However, two structural properties of SIDs make ARMs ill-suited. First, intra-item consistency: the n digits jointly specify one item, yet the left-to-right causality trains each digit only under its prefix and blocks bidirectional cross-digit evidence, collapsing supervision to a single causal path. Second, inter-digit heterogeneity: digits differ in semantic granularity and predictability, while the uniform next-token objective assigns equal weight to all digits, overtraining easy digits and undertraining hard digits. To address these two issues, we propose DiffGRM, a diffusion-based GR model that replaces the autoregressive decoder with a masked discrete diffusion model (MDM), thereby enabling bidirectional context and any-order parallel generation of SID digits for recommendation. Specifically, we tailor DiffGRM in three aspects: (1) tokenization with Parallel Semantic Encoding (PSE) to decouple digits and balance per-digit information; (2) training with On-policy Coherent Noising (OCN) that prioritizes uncertain digits via coherent masking to concentrate supervision on high-value signals; and (3) inference with Confidence-guided Parallel Denoising (CPD) that fills higher-confidence digits first and generates diverse Top-K candidates. Experiments show consistent gains over strong generative and discriminative recommendation baselines on multiple datasets, improving NDCG@10 by 6.9%-15.5%. Code is available at https://github.com/liuzhao09/DiffGRM.


Modeling Bias Evolution in Fashion Recommender Systems: A System Dynamics Approach

arXiv.org Artificial Intelligence

Bias in recommender systems not only distorts user experience but also perpetuates and amplifies existing societal stereotypes, particularly in sectors like fashion e-commerce. This study employs a dynamic modeling approach to scrutinize the mechanisms of bias activation and reinforcement within Fashion Recommender Systems (FRS). By leveraging system dynamics modeling and experimental simulations, we dissect the temporal evolution of bias and its multifaceted impacts on system performance. Our analysis reveals that inductive biases exert a more substantial influence on system outcomes than user biases, suggesting critical areas for intervention. We demonstrate that while current debiasing strategies, including data rebalancing and algorithmic regularization, are effective to an extent, they require further enhancement to comprehensively mitigate biases. This research underscores the necessity for advancing these strategies and extending system boundaries to incorporate broader contextual factors such as user demographics and item diversity, aiming to foster inclusivity and fairness in FRS. The findings advocate for a proactive approach in recommender system design to counteract bias propagation and ensure equitable user experiences.


Words to Waves: Emotion-Adaptive Music Recommendation System

arXiv.org Artificial Intelligence

Current recommendation systems often tend to overlook emotional context and rely on historical listening patterns or static mood tags. This paper introduces a novel music recommendation framework employing a variant of Wide and Deep Learning architecture that takes in real-time emotional states inferred directly from natural language as inputs and recommends songs that closely portray the mood. The system captures emotional contexts from user-provided textual descriptions by using transformer-based embeddings, which were finetuned to predict the emotional dimensions of valence-arousal. The deep component of the architecture utilizes these embeddings to generalize unseen emotional patterns, while the wide component effectively memorizes user-emotion and emotion-genre associations through cross-product features. Experimental results show that personalized music selections positively influence the user's emotions and lead to a significant improvement in emotional relevance.


Generalized Top-k Mallows Model for Ranked Choices

arXiv.org Machine Learning

The classic Mallows model is a foundational tool for modeling user preferences. However, it has limitations in capturing real-world scenarios, where users often focus only on a limited set of preferred items and are indifferent to the rest. To address this, extensions such as the top-k Mallows model have been proposed, aligning better with practical applications. In this paper, we address several challenges related to the generalized top-k Mallows model, with a focus on analyzing buyer choices. Our key contributions are: (1) a novel sampling scheme tailored to generalized top-k Mallows models, (2) an efficient algorithm for computing choice probabilities under this model, and (3) an active learning algorithm for estimating the model parameters from observed choice data. These contributions provide new tools for analysis and prediction in critical decision-making scenarios. We present a rigorous mathematical analysis for the performance of our algorithms. Furthermore, through extensive experiments on synthetic data and real-world data, we demonstrate the scalability and accuracy of our proposed methods, and we compare the predictive power of Mallows model for top-k lists compared to the simpler Multinomial Logit model.


A Short Note on Upper Bounds for Graph Neural Operator Convergence Rate

arXiv.org Machine Learning

ABSTRACT Graphons, as limits of graph sequences, provide a framework for analyzing the asymptotic behavior of graph neural operators. Spectral convergence of sampled graphs to graphons yields operator-level convergence rates, enabling transferability analyses of GNNs. This note summarizes known bounds under no assumptions, global Lipschitz continuity, and piecewise-Lipschitz continuity, highlighting tradeoffs between assumptions and rates, and illustrating their empirical tightness on synthetic and real data. Index T erms-- graph neural operator, graphon, convergence rates, graph neural networks, transferability 1. INTRODUCTION Graph neural networks (GNNs) are widely used in drug discovery [1, 2], social networks [3, 4], recommendation systems [5], and NLP [6, 7, 8]. GNNs operate on graph-structured data via message passing and aggregation [9], but training on large graphs is computationally expensive.


Towards Explainable Personalized Recommendations by Learning from Users' Photos

arXiv.org Artificial Intelligence

Explaining the output of a complex system, such as a Recommender System (RS), is becoming of utmost importance for both users and companies. In this paper we explore the idea that personalized explanations can be learned as recommendation themselves. There are plenty of online services where users can upload some photos, in addition to rating items. We assume that users take these photos to reinforce or justify their opinions about the items. For this reason we try to predict what photo a user would take of an item, because that image is the argument that can best convince her of the qualities of the item. In this sense, an RS can explain its results and, therefore, increase its reliability. Furthermore, once we have a model to predict attractive images for users, we can estimate their distribution. The paper includes a formal framework that estimates the authorship probability for a given pair (user, photo). To illustrate the proposal, we use data gathered from TripAdvisor containing the reviews (with photos) of restaurants in six cities of different sizes. Keywords: Recommender Systems, Personalization, Explainability, Photo, Collaborative 1. Introduction Explainable Artificial Intelligence (XAI) is becoming an important area of interest since explainability is increasingly necessary to meet stakeholder demands. In particular, the General Data Protection Regulation (GDPR) [29] of the European Union demands transparency in systems that take decisions affecting people, making explanations more needed than ever. Additionally, explanations may help increase the trust of users in AI algorithms, since people rely not only on their efficacy but also on the degree of understanding of the process they follow. Since they provide suggestions to users, explainability plays an important role on them.