Personal Assistant Systems
Effective and secure federated online learning to rank
Online Learning to Rank (OLTR) optimises ranking models using implicit user feedback, such as clicks. Unlike traditional Learning to Rank (LTR) methods that rely on a static set of training data with relevance judgements to learn a ranking model, OLTR methods update the model continually as new data arrives. Thus, it addresses several drawbacks such as the high cost of human annotations, potential misalignment between user preferences and human judgments, and the rapid changes in user query intents. However, OLTR methods typically require the collection of searchable data, user queries, and clicks, which poses privacy concerns for users. Federated Online Learning to Rank (FOLTR) integrates OLTR within a Federated Learning (FL) framework to enhance privacy by not sharing raw data. While promising, FOLTR methods currently lag behind traditional centralised OLTR due to challenges in ranking effectiveness, robustness with respect to data distribution across clients, susceptibility to attacks, and the ability to unlearn client interactions and data. This thesis presents a comprehensive study on Federated Online Learning to Rank, addressing its effectiveness, robustness, security, and unlearning capabilities, thereby expanding the landscape of FOLTR.
It Was a Record Year for Dating Apps. They Still Don't Have It Figured Out
Of all things about dating that people got wrong in 2024, one remains the standout: that old people don't have a lot of sex. On several dating platforms, boomers (individuals aged 59 to 72) were actually the fastest growing userbase. Aging singles were also having the most orgasmic sex of their lives, according to data from Match.com. Since 2022, the kink-positive app Feeld has experienced a 340 percent surge in users who are 60-plus. "Feeld has definitely introduced me to new desires and made me 100 percent more aware of my body and what I enjoy," Wendy, 72, said when we spoke in April.
Should You Divulge a Disability in Your Dating Profile?
Todd is looking for love, but he's unsure about disclosing something in dating profiles: his multiple sclerosis. On Slate's How To podcast, Todd got some crucial advice from Jessica Slice and Caroline Cupp, authors of Dateable: Swiping Right, Hooking Up, and Settling Down While Chronically Ill and Disabled. This week, we're sharing that wonderful episode with Death, Sex & Money listeners, and to kick things off, Anna talks to Carvell Wallace (the host of How To) about what makes this episode special. Listeners may remember Carvell from his appearance on DSM earlier this year. Do you have a problem that needs solving?
Optimization and Scalability of Collaborative Filtering Algorithms in Large Language Models
Yang, Haowei, Yun, Longfei, Cao, Jinghan, Lu, Qingyi, Tu, Yuming
Collaborative filtering (CF) is one of the most widely adopted algorithms in recommendation systems due to its ability to generate personalized recommendations based on user behavior data. However, the rapid growth in data volume and model complexity poses significant challenges to traditional collaborative filtering algorithms [2]. These include high computational overhead, data sparsity, the cold start problem, and difficulty in scaling. In the context of LLM-based recommendation systems, these challenges are further amplified due to the intricate interactions between users, content, and language model parameters. This research explores the optimization and scalability of collaborative filtering algorithms within large language models. We propose several optimization strategies, including matrix factorization, approximate nearest neighbor search, and parallel computing, to reduce computational complexity and improve accuracy [3]. This work builds on insights from [4], particularly its integration of neural matrix factorization with large language models to address cold start issues and improve recommendation accuracy through multimodal data. The multimodal fusion strategies and transformer-based methods in [5] provide valuable insights for improving data integration and scalability in collaborative filtering algorithms. The key insight from [6] is their approach to handling data imbalance and scalability, which is highly relevant for optimizing collaborative filtering algorithms in large language model-based recommendation systems. The use of CNNs and LSTMs in [7] for capturing nonlinear patterns informs optimizing collaborative filtering algorithms in LLM-based systems, improving efficiency and accuracy.
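The abstract above names matrix factorization as one of its optimization strategies but does not give the formulation used. As a rough illustration only, the following is a minimal sketch of the textbook approach: latent user and item factors fitted by stochastic gradient descent on observed ratings. The function names, hyperparameters, and `SAMPLE_RATINGS` data are all hypothetical and are not taken from the paper.

```python
import math
import random

# Hypothetical toy data: (user index, item index, rating) triples.
SAMPLE_RATINGS = [
    (0, 0, 5.0), (0, 1, 3.0),
    (1, 0, 4.0), (1, 2, 1.0),
    (2, 1, 2.0), (2, 2, 5.0),
]


def factorize(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02,
              epochs=500, seed=0):
    """Fit user factors P and item factors Q with plain SGD.

    Each observed rating r(u, i) is approximated by the dot product
    of row u of P and row i of Q; `reg` is an L2 penalty.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factors."""
    return sum(a * b for a, b in zip(P[u], Q[i]))


def rmse(P, Q, ratings):
    """Root mean squared error over the given ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))


if __name__ == "__main__":
    P, Q = factorize(SAMPLE_RATINGS, n_users=3, n_items=3)
    print("training RMSE:", rmse(P, Q, SAMPLE_RATINGS))
    # Score a pair that was never observed (user 0, item 2).
    print("predicted r(0, 2):", predict(P, Q, 0, 2))
```

The point of the low-rank form is that a prediction costs only `k` multiplications per user-item pair, which is what makes it a common starting point for scaling collaborative filtering.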
Adaptive Self-supervised Learning for Social Recommendations
He, Xin, Lin, Shanru, Fan, Wenqi, Sun, Mingchen, Wang, Ying, Wang, Xin
In recent years, researchers have attempted to exploit social relations to improve the performance of recommendation systems. Generally, most existing social recommendation methods depend heavily on substantial domain knowledge and expertise in primary recommendation tasks for designing useful auxiliary tasks. Meanwhile, Self-Supervised Learning (SSL) has recently received considerable attention in the field of recommendation, since it can provide self-supervision signals that help improve target recommendation systems by constructing self-supervised auxiliary tasks from raw data without human-annotated labels. Despite their great success, these SSL-based social recommendation methods cannot adaptively balance the various self-supervised auxiliary tasks: assigning equal weights to the auxiliary tasks can result in sub-optimal recommendation performance, because different auxiliary tasks may contribute differently to improving the primary social recommendation task across different datasets. To address this issue, in this work, we propose Adaptive Self-supervised Learning for Social Recommendations (AdasRec) by taking advantage of various self-supervised auxiliary tasks. More specifically, an adaptive weighting mechanism is proposed to learn adaptive weights for various self-supervised auxiliary tasks, so as to balance the contribution of such self-supervised auxiliary tasks for enhancing representation learning in social recommendations. The adaptive weighting mechanism assigns different weights to the auxiliary tasks to achieve an overall weighting across all of them and ultimately assist the primary recommendation task; it is formulated as a meta learning optimization problem with an adaptive weighting network. Comprehensive experiments on various real-world datasets are conducted to verify the effectiveness of our proposed method.
Enhanced Recommendation Combining Collaborative Filtering and Large Language Models
Lin, Xueting, Cheng, Zhan, Yun, Longfei, Lu, Qingyi, Luo, Yuanshuai
With the advent of the information explosion era, the importance of recommendation systems in various applications is increasingly significant. Traditional collaborative filtering algorithms are widely used due to their effectiveness in capturing user behavior patterns, but they encounter limitations when dealing with cold start problems and data sparsity. Large Language Models (LLMs), with their strong natural language understanding and generation capabilities, provide a new breakthrough for recommendation systems. This study proposes an enhanced recommendation method that combines collaborative filtering and LLMs, aiming to leverage collaborative filtering's advantage in modeling user preferences while enhancing the understanding of textual information about users and items through LLMs to improve recommendation accuracy and diversity. This paper first introduces the fundamental theories of collaborative filtering and LLMs, then designs a recommendation system architecture that integrates both, and validates the system's effectiveness through experiments. The results show that the hybrid model based on collaborative filtering and LLMs significantly improves precision, recall, and user satisfaction, demonstrating its potential in complex recommendation scenarios.
MixRec: Heterogeneous Graph Collaborative Filtering
Xia, Lianghao, Xie, Meiyan, Xu, Yong, Huang, Chao
For modern recommender systems, the use of low-dimensional latent representations to embed users and items based on their observed interactions has become commonplace. However, many existing recommendation models are primarily designed for coarse-grained and homogeneous interactions, which limits their effectiveness in two critical dimensions. Firstly, these models fail to leverage the relational dependencies that exist across different types of user behaviors, such as page views, collects, comments, and purchases. Secondly, they struggle to capture the fine-grained latent factors that drive user interaction patterns. To address these limitations, we present a heterogeneous graph collaborative filtering model MixRec that excels at disentangling users' multi-behavior interaction patterns and uncovering the latent intent factors behind each behavior. Our model achieves this by incorporating intent disentanglement and multi-behavior modeling, facilitated by a parameterized heterogeneous hypergraph architecture. Furthermore, we introduce a novel contrastive learning paradigm that adaptively explores the advantages of self-supervised data augmentation, thereby enhancing the model's resilience against data sparsity and expressiveness with relation heterogeneity. To validate the efficacy of MixRec, we conducted extensive experiments on three public datasets. The results clearly demonstrate its superior performance, significantly outperforming various state-of-the-art baselines. Our model is open-sourced and available at: https://github.com/HKUDS/MixRec.
The Value of AI-Generated Metadata for UGC Platforms: Evidence from a Large-scale Field Experiment
Zhang, Xinyi, Sun, Chenshuo, Zhang, Renyu, Goh, Khim-Yong
AI-generated content (AIGC), such as advertisement copy, product descriptions, and social media posts, is becoming ubiquitous in business practices. However, the value of AI-generated metadata, such as titles, remains unclear on user-generated content (UGC) platforms. To address this gap, we conducted a large-scale field experiment on a leading short-video platform in Asia to provide about 1 million users access to AI-generated titles for their uploaded videos. Our findings show that the provision of AI-generated titles significantly boosted content consumption, increasing valid watches by 1.6% and watch duration by 0.9%. When producers adopted these titles, these increases jumped to 7.1% and 4.1%, respectively. This viewership-boost effect was largely attributed to the use of this generative AI (GAI) tool increasing the likelihood of videos having a title by 41.4%. The effect was more pronounced for groups more affected by metadata sparsity. Mechanism analysis revealed that AI-generated metadata improved user-video matching accuracy in the platform's recommender system. Interestingly, for a video for which the producer would have posted a title anyway, adopting the AI-generated title decreased its viewership on average, implying that AI-generated titles may be of lower quality than human-generated ones. However, when producers chose to co-create with GAI and significantly revised the AI-generated titles, the videos outperformed their counterparts with either fully AI-generated or human-generated titles, showcasing the benefits of human-AI co-creation. This study highlights the value of AI-generated metadata and human-AI metadata co-creation in enhancing user-content matching and content consumption for UGC platforms.
An Automatic Graph Construction Framework based on Large Language Models for Recommendation
Shan, Rong, Lin, Jianghao, Zhu, Chenxu, Chen, Bo, Zhu, Menghui, Zhang, Kangning, Zhu, Jieming, Tang, Ruiming, Yu, Yong, Zhang, Weinan
Graph neural networks (GNNs) have emerged as state-of-the-art methods to learn from graph-structured data for recommendation. However, most existing GNN-based recommendation methods focus on the optimization of model structures and learning strategies based on pre-defined graphs, neglecting the importance of the graph construction stage. Earlier works for graph construction usually rely on specific rules or crowdsourcing, which are either too simplistic or too labor-intensive. Recent works start to utilize large language models (LLMs) to automate the graph construction, in view of their abundant open-world knowledge and remarkable reasoning capabilities. Nevertheless, they generally suffer from two limitations: (1) invisibility of global view (e.g., overlooking contextual information) and (2) construction inefficiency. To this end, we introduce AutoGraph, an automatic graph construction framework based on LLMs for recommendation. Specifically, we first use LLMs to infer the user preference and item knowledge, which is encoded as semantic vectors. Next, we employ vector quantization to extract the latent factors from the semantic vectors. The latent factors are then incorporated as extra nodes to link the user/item nodes, resulting in a graph with in-depth global-view semantics. We further design metapath-based message aggregation to effectively aggregate the semantic and collaborative information. The framework is model-agnostic and compatible with different backbone models. Extensive experiments on three real-world datasets demonstrate the efficacy and efficiency of AutoGraph compared to existing baseline methods. We have deployed AutoGraph in Huawei advertising platform, and gain a 2.69% improvement on RPM and a 7.31% improvement on eCPM in the online A/B test. Currently AutoGraph has been used as the main traffic model, serving hundreds of millions of people.
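The quantization step described above, which maps semantic vectors to latent factors that then become extra graph nodes, can be pictured with a small sketch. This is only the generic nearest-codeword lookup at the heart of vector quantization, assuming a fixed codebook; how AutoGraph learns its codebook is not stated in the abstract, and every name below (`quantize`, `link_latent_factors`, the `factor_` node labels) is hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vectors, codebook):
    """Return, for each vector, the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)),
            key=lambda c: squared_distance(v, codebook[c]))
        for v in vectors
    ]


def link_latent_factors(node_ids, codes):
    """Build (node, latent-factor node) edges from quantization codes."""
    return [(node, f"factor_{code}") for node, code in zip(node_ids, codes)]


if __name__ == "__main__":
    # Hypothetical 2-d semantic vectors for three items and a 2-entry codebook.
    item_vectors = [[0.9, 0.1], [0.1, 0.8], [0.7, 0.2]]
    codebook = [[1.0, 0.0], [0.0, 1.0]]
    codes = quantize(item_vectors, codebook)
    print(link_latent_factors(["item_a", "item_b", "item_c"], codes))
```

Items that share a code end up attached to the same latent-factor node, which is one way nodes that never co-occur in the interaction data can still become connected.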
Meta is reportedly adding displays to its Ray-Ban smart glasses
It looks like Meta is preparing to add displays to its popular line of Ray-Ban smart glasses, according to a report by the Financial Times. These screens could show up in a future iteration of the device as early as next year. The likely release window is the second half of 2025. According to folks familiar with Meta's plans, the screens will be on the smaller side and will likely be used to display notifications or responses from Meta's AI virtual assistant. It's highly unlikely that the company is planning on making this a full mixed-reality device just yet.