Collaborating Authors

 Personal Assistant Systems


LLM-KT: A Versatile Framework for Knowledge Transfer from Large Language Models to Collaborative Filtering

arXiv.org Artificial Intelligence

We present LLM-KT, a flexible framework designed to enhance collaborative filtering (CF) models by integrating features generated by large language models (LLMs). Unlike existing methods that pass LLM-generated features as direct inputs, our framework injects these features into an intermediate layer of any CF model, allowing the model to reconstruct and leverage the embeddings internally. This model-agnostic approach works with a wide range of CF models without requiring architectural changes, making it adaptable to various recommendation scenarios. Our framework is built for easy integration and modification, giving researchers and developers a tool for extending CF model capabilities through efficient knowledge transfer. We demonstrate its effectiveness through experiments on the MovieLens and Amazon datasets, where it consistently improves baseline CF models. Experiments show that LLM-KT is competitive with state-of-the-art methods in context-aware settings while applying to a broader range of CF models than current approaches.
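
One way to read "injecting features into an intermediate layer" is as an auxiliary reconstruction objective: a hidden representation of the CF model is projected and pulled toward the LLM-generated embedding while the usual CF loss is still optimized. The sketch below illustrates that reading only. The function names, the linear projection, the mean-squared-error term, and the weight `alpha` are all assumptions made for illustration and are not taken from the abstract.

```python
def project(hidden, weight):
    """Apply a linear projection (hypothetical) from hidden space to LLM-embedding space."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weight]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_loss(cf_loss, hidden, weight, llm_embedding, alpha=0.5):
    """CF loss plus a weighted reconstruction term on an intermediate layer.

    cf_loss       -- scalar loss of the base CF model (assumed given)
    hidden        -- intermediate-layer activation for one user
    weight        -- projection matrix, one row per LLM-embedding dimension
    llm_embedding -- LLM-generated feature vector for the same user
    alpha         -- assumed trade-off weight, not specified in the abstract
    """
    reconstruction = mse(project(hidden, weight), llm_embedding)
    return cf_loss + alpha * reconstruction
```

Because the extra term only reads an existing activation, the base model's inputs and architecture stay unchanged, which matches the model-agnostic claim in the abstract.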


Biotic Browser: Applying StreamingLLM as a Persistent Web Browsing Co-Pilot

arXiv.org Artificial Intelligence

This paper presents "Biotic Browser," an innovative AI assistant leveraging StreamingLLM to transform web navigation and task execution. Characterized by its ability to simulate the experience of a passenger in an autonomous vehicle, the Biotic Browser excels in managing extended interactions and complex, multi-step web-based tasks. It marks a significant advancement in AI technology, particularly in the realm of long-term context management, and offers promising applications for enhancing productivity and efficiency in both personal and professional settings.


PSL: Rethinking and Improving Softmax Loss from Pairwise Perspective for Recommendation

arXiv.org Artificial Intelligence

Softmax Loss (SL) is widely applied in recommender systems (RS) and has demonstrated effectiveness. This work analyzes SL from a pairwise perspective, revealing two significant limitations: 1) the relationship between SL and conventional ranking metrics like DCG is not sufficiently tight; 2) SL is highly sensitive to false negative instances. Our analysis indicates that these limitations are primarily due to the use of the exponential function. To address these issues, this work extends SL to a new family of loss functions, termed Pairwise Softmax Loss (PSL), which replaces the exponential function in SL with other appropriate activation functions. While the revision is minimal, we highlight three merits of PSL: 1) it serves as a tighter surrogate for DCG with suitable activation functions; 2) it better balances data contributions; and 3) it acts as a specific BPR loss enhanced by Distributionally Robust Optimization (DRO).
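
The pairwise view can be sketched as follows, assuming a single user with one positive item and a set of sampled negatives. Writing d = score(negative) - score(positive), softmax loss becomes the log of the summed exp(d / tau), and the PSL idea is to swap exp for another activation raised to the power 1 / tau. The tanh-based activation, the temperature `tau`, and the small epsilon used here are illustrative assumptions; the abstract does not say which activations the authors use.

```python
import math


def pairwise_loss(pos_score, neg_scores, activation, tau=1.0):
    """log of sum over negatives of activation(d) ** (1 / tau), d = neg - pos."""
    total = 0.0
    for neg in neg_scores:
        d = neg - pos_score
        total += activation(d) ** (1.0 / tau)
    # Small epsilon (assumption) guards against log(0) for bounded activations.
    return math.log(total + 1e-12)


def exp_act(d):
    """Exponential activation: recovers the sampled softmax loss."""
    return math.exp(d)


def tanh_act(d):
    """Bounded surrogate (illustrative choice), with values in (0, 2)."""
    return math.tanh(d) + 1.0


# A negative scored far above the positive, as a false negative might be.
sl = pairwise_loss(0.0, [5.0], exp_act)
psl = pairwise_loss(0.0, [5.0], tanh_act)
```

With the exponential, the single high-scoring negative contributes a loss of about 5, while the bounded activation caps its contribution below log 2. This is one way to see the sensitivity to false negatives that the abstract attributes to the exponential function.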


Help! I Wrote to Prudie for Advice and Leigh Bardugo Answered.

Slate

This special edition is part of our Guest Prudie series, where we ask smart, thoughtful people to step in as Prudie for the day and give you advice. Today's columnist is number one New York Times-bestselling author Leigh Bardugo. She is the author of The Familiar and Ninth House and the creator of the Grishaverse (now a Netflix original series), which spans the Shadow and Bone trilogy, the Six of Crows duology, and the King of Scars duology. Her short fiction has appeared in multiple anthologies including The Best American Science Fiction and Fantasy. She lives in Los Angeles and is an associate fellow of Pauli Murray College at Yale University. We asked Bardugo to weigh in on "romantic" gestures gone wrong, conversational vampires, and vocal dogs: I recently met a man on a dating app. We hit it off quickly. We were texting all of the time about work, writing, and the world, often getting pretty flirty. I was having tons of fun. He was charming and seemed to me conspicuously brilliant.


ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning

arXiv.org Artificial Intelligence

This paper presents ReasoningRec, a reasoning-based recommendation framework that leverages Large Language Models (LLMs) to bridge the gap between recommendations and human-interpretable explanations. In contrast to conventional recommendation systems that rely on implicit user-item interactions, ReasoningRec employs LLMs to model users and items, focusing on preferences, aversions, and explanatory reasoning. The framework utilizes a larger LLM to generate synthetic explanations for user preferences, subsequently used to fine-tune a smaller LLM for enhanced recommendation accuracy and human-interpretable explanation. Our experimental study investigates the impact of reasoning and contextual information on personalized recommendations, revealing that the quality of contextual and personalized data significantly influences the LLM's capacity to generate plausible explanations. Empirical evaluations demonstrate that ReasoningRec surpasses state-of-the-art methods by up to 12.5% in recommendation prediction while concurrently providing human-intelligible explanations. The code is available here: https://github.com/millenniumbismay/reasoningrec.


Causality-Enhanced Behavior Sequence Modeling in LLMs for Personalized Recommendation

arXiv.org Artificial Intelligence

Recent advancements in recommender systems have focused on leveraging Large Language Models (LLMs) to improve user preference modeling, yielding promising outcomes. However, current LLM-based approaches struggle to fully leverage user behavior sequences, resulting in suboptimal preference modeling for personalized recommendations. In this study, we propose a novel Counterfactual Fine-Tuning (CFT) method to address this issue by explicitly emphasizing the role of behavior sequences when generating recommendations. Specifically, we employ counterfactual reasoning to identify the causal effects of behavior sequences on model output and introduce a task that directly fits the ground-truth labels based on these effects, achieving the goal of explicit emphasis. Additionally, we develop a token-level weighting mechanism to adjust the emphasis strength for different item tokens, reflecting the diminishing influence of behavior sequences from earlier to later tokens when predicting an item. Extensive experiments on real-world datasets demonstrate that CFT effectively improves behavior sequence modeling. Our code is available at https://github.com/itsmeyjt/CFT.
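
A minimal sketch of the counterfactual idea, under stated assumptions: the effect of the behavior sequence on each output token is taken as the difference between the logits computed with the sequence and the logits computed without it, and a cross-entropy term fits the ground-truth token on that difference. The linear decay of token weights and its `floor` parameter are hypothetical choices made for illustration; the abstract says only that emphasis diminishes from earlier to later item tokens. None of the names below come from the authors' repository.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one token's vocabulary logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def decaying_weights(n, floor=0.2):
    """Linearly decreasing weights from 1.0 down to `floor` (assumed schedule)."""
    if n == 1:
        return [1.0]
    return [1.0 - (1.0 - floor) * k / (n - 1) for k in range(n)]


def effect_loss(logits_with, logits_without, targets, weights):
    """Weighted cross-entropy on per-token effect logits.

    logits_with    -- per-token logits given the behavior sequence
    logits_without -- per-token logits with the sequence removed (counterfactual)
    targets        -- ground-truth token index at each position
    weights        -- per-token emphasis weights
    """
    total = 0.0
    for with_seq, without_seq, target, w in zip(
        logits_with, logits_without, targets, weights
    ):
        effect = [a - b for a, b in zip(with_seq, without_seq)]
        total += -w * log_softmax(effect)[target]
    return total / sum(weights)
```

If the model ignores the behavior sequence, the two sets of logits coincide, the effect is zero everywhere, and the loss equals the log of the vocabulary size. Lowering the loss therefore requires the sequence to shift the output toward the ground-truth item.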


Dual Contrastive Transformer for Hierarchical Preference Modeling in Sequential Recommendation

arXiv.org Artificial Intelligence

Sequential recommender systems (SRSs) aim to predict the next items that may interest users by comprehensively modeling the complex preferences embedded in sequences of user-item interactions. However, most existing SRSs model only users' low-level preference based on item ID information while ignoring the high-level preference revealed by item attribute information, such as item category. Furthermore, they often use limited sequence context information to predict the next item while overlooking richer inter-item semantic relations. To this end, we propose a novel hierarchical preference modeling framework that models the complex low- and high-level preference dynamics for accurate sequential recommendation. Specifically, a novel dual-transformer module and a novel dual contrastive learning scheme are designed to discriminatively learn users' low- and high-level preferences and to enhance both low- and high-level preference learning, respectively. In addition, a novel semantics-enhanced context embedding module is devised to generate more informative context embeddings and further improve recommendation performance. Extensive experiments on six real-world datasets demonstrate both the superiority of our proposed method over state-of-the-art ones and the rationality of our design.


Google TV Streamer review: A great side piece for your TV, with a dash of smart home chops and (inessential) AI

Engadget

What we once called the Google Chromecast (and then the Chromecast with Google TV) is now the Google TV Streamer. I won't pretend to understand the reasoning behind any product's rebrand, but at least this one makes a bit of sense. Casting content from elsewhere used to be a big reason TV dongles existed. Today, streaming devices primarily provide the brains required to watch content from Netflix, Disney and other streaming services on almost any screen, and casting is a bit of an afterthought. A name that focuses on Google TV's interface instead of casting seems right in 2024.


Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs

arXiv.org Artificial Intelligence

Personalized recommendation is a ubiquitous application on the internet, with many industries and hyperscalers extensively leveraging Deep Learning Recommendation Models (DLRMs) for their personalization needs (like ad serving or movie suggestions). With growing model and dataset sizes pushing computation and memory requirements, GPUs are increasingly preferred for executing DLRM inference. However, serving newer DLRMs within acceptable latencies remains challenging, making traditional deployments increasingly GPU-hungry and raising inference serving costs. In this paper, we show that the embedding stage continues to be the primary bottleneck in the GPU inference pipeline, causing up to a 3.2x embedding-only performance slowdown. To understand the problem, we conduct a detailed microarchitecture characterization and highlight the presence of low occupancy in the standard embedding kernels. By leveraging direct compiler optimizations, we achieve optimal occupancy, improving performance by up to 53%. Yet, long memory latency stalls continue to exist. To tackle this challenge, we propose specialized plug-and-play software prefetching and L2 pinning techniques, which help hide and reduce these latencies. Further, we propose combining them, as they complement each other. Experimental evaluations using A100 GPUs with large models and datasets show that our proposed techniques improve performance by up to 103% for the embedding stage, and up to 77% for the overall DLRM inference pipeline.


Dual Conditional Diffusion Models for Sequential Recommendation

arXiv.org Artificial Intelligence

Recent advancements in diffusion models have shown promising results in sequential recommendation (SR). However, current diffusion-based methods still exhibit two key limitations. First, they implicitly model the diffusion process for target item embeddings rather than the discrete target item itself, leading to inconsistency in the recommendation process. Second, existing methods rely on either implicit or explicit conditional diffusion models, limiting their ability to fully capture the context of user behavior and leading to less robust target item embeddings. In this paper, we propose the Dual Conditional Diffusion Models for Sequential Recommendation (DCRec), introducing a discrete-to-continuous sequential recommendation diffusion framework. Our framework introduces a complete Markov chain to model the transition from the reversed target item representation to the discrete item index, bridging the discrete and continuous item spaces for diffusion models and ensuring consistency with the diffusion framework. Building on this framework, we present the Dual Conditional Diffusion Transformer (DCDT) that incorporates the implicit conditional and the explicit conditional for diffusion-based SR. Extensive experiments on public benchmark datasets demonstrate that DCRec outperforms state-of-the-art methods.