Personal Assistant Systems
Spectral Thompson sampling
Kocak, Tomas, Valko, Michal, Munos, Remi, Agrawal, Shipra
Thompson Sampling (TS) has attracted a lot of interest due to its good empirical performance, in particular in the computational advertising. Though successful, the tools for its performance analysis appeared only recently. In this paper, we describe and analyze SpectralTS algorithm for a bandit problem, where the payoffs of the choices are smooth given an underlying graph. In this setting, each choice is a node of a graph and the expected payoffs of the neighboring nodes are assumed to be similar. Although the setting has application both in recommender systems and advertising, the traditional algorithms would scale poorly with the number of choices. For that purpose we consider an effective dimension d, which is small in real-world graphs. We deliver the analysis showing that the regret of SpectralTS scales as d*sqrt(T ln N) with high probability, where T is the time horizon and N is the number of choices. Since a d*sqrt(T ln N) regret is comparable to the known results, SpectralTS offers a computationally more efficient alternative. We also show that our algorithm is competitive on both synthetic and real-world data.
Unified Precision-Guaranteed Stopping Rules for Contextual Learning
Ding, Mingrui, Zhao, Qiuhong, Gao, Siyang, Dong, Jing
Contextual learning seeks to learn a decision policy that maps an individual's characteristics to an action through data collection. In operations management, such data may come from various sources, and a central question is when data collection can stop while still guaranteeing that the learned policy is sufficiently accurate. We study this question under two precision criteria: a context-wise criterion and an aggregate policy-value criterion. We develop unified stopping rules for contextual learning with unknown sampling variances in both unstructured and structured linear settings. Our approach is based on generalized likelihood ratio (GLR) statistics for pairwise action comparisons. To calibrate the corresponding sequential boundaries, we derive new time-uniform deviation inequalities that directly control the self-normalized GLR evidence and thus avoid the conservativeness caused by decoupling mean and variance uncertainty. Under the Gaussian sampling model, we establish finite-sample precision guarantees for both criteria. Numerical experiments on synthetic instances and two case studies demonstrate that the proposed stopping rules achieve the target precision with substantially fewer samples than benchmark methods. The proposed framework provides a practical way to determine when enough information has been collected in personalized decision problems. It applies across multiple data-collection environments, including historical datasets, simulation models, and real systems, enabling practitioners to reduce unnecessary sampling while maintaining a desired level of decision quality.
User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction
Hao, Yuren, Mehri, Shuhaib, Zhai, ChengXiang, Hakkani-Tรผr, Dilek
Large language models are increasingly used as personal assistants, yet most lack a persistent user model, forcing users to repeatedly restate preferences across sessions. We propose Vector-Adapted Retrieval Scoring (VARS), a pipeline-agnostic, frozen-backbone framework that represents each user with long-term and short-term vectors in a shared preference space and uses these vectors to bias retrieval scoring over structured preference memory. The vectors are updated online from weak scalar rewards from users' feedback, enabling personalization without per-user fine-tuning. We evaluate on \textsc{MultiSessionCollab}, an online multi-session collaboration benchmark with rich user preference profiles, across math and code tasks. Under frozen backbones, the main benefit of user-aware retrieval is improved interaction efficiency rather than large gains in raw task accuracy: our full VARS agent achieves the strongest overall performance, matches a strong Reflection baseline in task success, and reduces timeout rate and user effort. The learned long-term vectors also align with cross-user preference overlap, while short-term vectors capture session-specific adaptation, supporting the interpretability of the dual-vector design. Code, model, and data are available at https://github.com/YurenHao0426/VARS.
A Job I Like or a Job I Can Get: Designing Job Recommender Systems Using Field Experiments
Bied, Guillaume, Caillou, Philippe, Crรฉpon, Bruno, Gaillac, Christophe, Pรฉrennes, Elia, Sebag, Michรจle
Recommendation systems (RSs) are increasingly used to guide job seekers on online platforms, yet the algorithms currently deployed are typically optimized for predictive objectives such as clicks, applications, or hires, rather than job seekers' welfare. We develop a job-search model with an application stage in which the value of a vacancy depends on two dimensions: the utility it delivers to the worker and the probability that an application succeeds. The model implies that welfare-optimal RSs rank vacancies by an expected-surplus index combining both, and shows why rankings based solely on utility, hiring probabilities, or observed application behavior are generically suboptimal, an instance of the inversion problem between behavior and welfare. We test these predictions and quantify their practical importance through two randomized field experiments conducted with the French public employment service. The first experiment, comparing existing algorithms and their combinations, provides behavioral evidence that both dimensions shape application decisions. Guided by the model and these results, the second experiment extends the comparison to an RS designed to approximate the welfare-optimal ranking. The experiments generate exogenous variation in the vacancies shown to job seekers, allowing us to estimate the model, validate its behavioral predictions, and construct a welfare metric. Algorithms informed by the model-implied optimal ranking substantially outperform existing approaches and perform close to the welfare-optimal benchmark. Our results show that embedding predictive tools within a simple job-search framework and combining it with experimental evidence yields recommendation rules with substantial welfare gains in practice.
Preference Completion from Partial Rankings
Suriya Gunasekar, Oluwasanmi O. Koyejo, Joydeep Ghosh
We propose a novel and efficient algorithm for the collaborative preference completion problem, which involves jointly estimating individualized rankings for a set of entities over a shared set of items, based on a limited number of observed affinity values. Our approach exploits the observation that while preferences are often recorded as numerical scores, the predictive quantity of interest is the underlying rankings. Thus, attempts to closely match the recorded scores may lead to overfitting and impair generalization performance. Instead, we propose an estimator that directly fits the underlying preference order, combined with nuclear norm constraints to encourage low-rank parameters. Besides (approximate) correctness of the ranking order, the proposed estimator makes no generative assumption on the numerical scores of the observations. One consequence is that the proposed estimator can fit any consistent partial ranking over a subset of the items represented as a directed acyclic graph (DAG), generalizing standard techniques that can only fit preference scores. Despite this generality, for supervision representing total or blockwise total orders, the computational complexity of our algorithm is within a log factor of the standard algorithms for nuclear norm regularization based estimates for matrix completion.
Data Poisoning Attacks on Factorization-Based Collaborative Filtering
Bo Li, Yining Wang, Aarti Singh, Yevgeniy Vorobeychik
Recommendation and collaborative filtering systems are important in modern information and e-commerce applications. As these systems are becoming increasingly popular in the industry, their outputs could affect business decision making, introducing incentives for an adversarial party to compromise the availability or integrity of such systems. We introduce a data poisoning attack on collaborative filtering systems. We demonstrate how a powerful attacker with full knowledge of the learner can generate malicious data so as to maximize his/her malicious objectives, while at the same time mimicking normal user behavior to avoid being detected. While the complete knowledge assumption seems extreme, it enables a robust assessment of the vulnerability of collaborative filtering schemes to highly motivated attacks.
GE Profile Smart Grind and Brew Review: Just the Basics
This easy-to-use, Wi-Fi-enabled bean-to-cup brewer is good, but not quite great. App is simple and works well. "Smart" features only work with Amazon Alexa and Google Assistant. Integrating with HomeKit via third-party apps is not worth the effort. Pricey for what's essentially an auto-drip machine that works with an app, which is no longer novel or futuristic.
Coevolutionary Latent Feature Processes for Continuous-Time User-Item Interactions
Yichen Wang, Nan Du, Rakshit Trivedi, Le Song
Matching users to the right items at the right time is a fundamental task in recommendation systems. As users interact with different items over time, users' and items' feature may evolve and co-evolve over time. Traditional models based on static latent features or discretizing time into epochs can become ineffective for capturing the fine-grained temporal dynamics in the user-item interactions. We propose a coevolutionary latent feature process model that accurately captures the coevolving nature of users' and items' feature. To learn parameters, we design an efficient convex optimization algorithm with a novel low rank space sharing constraints. Extensive experiments on diverse real-world datasets demonstrate significant improvements in user behavior prediction compared to state-of-the-arts.