Personal Assistant Systems
An Efficient Recommendation Model Based on Knowledge Graph Attention-Assisted Network (KGAT-AX)
Recommendation systems play a crucial role in helping users filter through vast amounts of information. However, traditional recommendation algorithms often overlook the integration and utilization of multi-source information, limiting system performance. Therefore, this study proposes a novel recommendation model, Knowledge Graph Attention-assisted Network (KGAT-AX). We first incorporate the knowledge graph into the recommendation model, introducing an attention mechanism to explore higher order connectivity more explicitly. By using multilayer interactive information propagation, the model aggregates information to enhance its generalization ability. Furthermore, we integrate auxiliary information into entities through holographic embeddings, aggregating the information of adjacent entities for each entity by learning their inferential relationships. This allows for better utilization of auxiliary information associated with entities. We conducted experiments on real datasets to demonstrate the rationality and effectiveness of the KGAT-AX model. Through experimental analysis, we observed the effectiveness and potential of KGAT-AX compared to other baseline models on public datasets. KGAT-AX demonstrates better knowledge information capture and relationship learning capabilities.
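The abstract does not spell out how KGAT-AX applies holographic embeddings, so the following is only a minimal sketch of the general holographic-embedding (HolE) idea: two entity vectors are combined by circular correlation and scored against a relation vector. The function names and the toy 3-dimensional vectors are illustrative assumptions, not values or APIs from the paper.

```python
import math


def circular_correlation(a, b):
    """Circular correlation: out[k] = sum_i a[i] * b[(i + k) % d]."""
    d = len(a)
    return [sum(a[i] * b[(i + k) % d] for i in range(d)) for k in range(d)]


def hole_score(head, relation, tail):
    """HolE-style triple score: sigmoid(relation . circular_correlation(head, tail))."""
    corr = circular_correlation(head, tail)
    logit = sum(r * c for r, c in zip(relation, corr))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy embeddings, made up purely for illustration.
head = [1.0, 0.0, 0.0]
tail = [0.0, 1.0, 0.0]
relation = [0.0, 1.0, 0.0]

correlation = circular_correlation(head, tail)
score = hole_score(head, relation, tail)
```

Circular correlation keeps the combined vector at the same dimensionality as the inputs, which is why it is often chosen over a full tensor product when entity embeddings must stay compact.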
Bezos' Amazon and Blue Origin back Harris as Alexa gushes over VP
OutKick host Tomi Lahren sounds off on viral video of Amazon Alexa praising Harris while refusing to speak on why its users should support Trump. While videos circulate online of Amazon's Alexa giving vastly different answers when asked to make a quick argument for voting for Vice President Harris versus voting for former President Trump, federal donor records show the tech giant's employees have a clear favorite. A review of a Federal Election Commission database shows employees at Jeff Bezos' companies Amazon and Blue Origin have contributed significantly to VP Kamala Harris' campaign. According to the nonpartisan research group OpenSecrets, Amazon donors have contributed $1,000,140 to Vice President Kamala Harris during the 2024 election cycle. Blue Origin workers donated much less to the Harris campaign, at roughly $27,000.
Lindsey Graham puts Amazon 'on notice' over Alexa's potential election interference
FIRST ON FOX: Sen. Lindsey Graham, R-S.C., issued a stern warning to Amazon after its virtual assistant technology, Alexa, was found to be politically biased in favor of Vice President Kamala Harris over former President Trump. Graham, the ranking member of the Senate Committee on the Judiciary, told Amazon president and CEO Andrew Jassy in a letter he was putting him "on notice that I will not allow this to go unaddressed." On Tuesday, videos of interactions with Alexa went viral as the technology responded to queries on why someone should vote for Harris or why they should vote for Trump. Sen. Lindsey Graham, R-S.C., demanded an explanation from Amazon after its Alexa technology gave biased answers.
Deep Adaptive Interest Network: Personalized Recommendation with Context-Aware Learning
Huang, Shuaishuai, Yang, Haowei, Yao, You, Lin, Xueting, Tu, Yuming
In personalized recommendation systems, accurately capturing users' evolving interests and combining them with contextual information is a critical research area. This paper proposes a novel model called the Deep Adaptive Interest Network (DAIN), which dynamically models users' interests while incorporating context-aware learning mechanisms to achieve precise and adaptive personalized recommendations. DAIN leverages deep learning techniques to build an adaptive interest network structure that can capture users' interest changes in real-time while further optimizing recommendation results by integrating contextual information. Experiments conducted on several public datasets demonstrate that DAIN excels in both recommendation performance and computational efficiency. This research not only provides a new solution for personalized recommendation systems but also offers fresh insights into the application of context-aware learning in recommendation systems.
A Fashion Item Recommendation Model in Hyperbolic Space
Shimizu, Ryotaro, Wang, Yu, Kimura, Masanari, Hirakawa, Yuki, Wada, Takashi, Saito, Yuki, McAuley, Julian
In this work, we propose a fashion item recommendation model that incorporates hyperbolic geometry into user and item representations. Using hyperbolic space, our model aims to capture implicit hierarchies among items based on their visual data and users' purchase history. During training, we apply a multi-task learning framework that considers both hyperbolic and Euclidean distances in the loss function. Our experiments on three data sets show that our model performs better than previous models trained in Euclidean space only, confirming the effectiveness of our model. Our ablation studies show that multi-task learning plays a key role, and removing the Euclidean loss substantially deteriorates the model performance.
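To make the hyperbolic-versus-Euclidean contrast concrete, here is a minimal sketch that assumes the Poincaré ball model (the abstract does not say which model of hyperbolic space the authors use). The `multitask_distance` helper and its `weight` parameter are hypothetical stand-ins for the idea of combining both distances in one objective, not the paper's actual loss.

```python
import math


def euclidean_distance(u, v):
    """Ordinary straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def poincare_distance(u, v):
    """Geodesic distance in the Poincaré ball; both points need norm < 1."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def multitask_distance(u, v, weight=0.5):
    """Hypothetical weighted mix of the hyperbolic and Euclidean distances."""
    return weight * poincare_distance(u, v) + (1.0 - weight) * euclidean_distance(u, v)


origin = [0.0, 0.0]
near_center = [0.5, 0.0]
near_boundary = [0.9, 0.0]

d_center = poincare_distance(origin, near_center)
d_boundary = poincare_distance(origin, near_boundary)
```

Distances in the Poincaré ball grow rapidly toward the boundary, which leaves room for many leaf-like items far from the origin and is the usual motivation for embedding hierarchies in hyperbolic space.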
AlignGroup: Learning and Aligning Group Consensus with Member Preferences for Group Recommendation
Xu, Jinfeng, Chen, Zheyu, Li, Jinze, Yang, Shuo, Wang, Hewei, Ngai, Edith C. -H.
Group activities are important behaviors in human society, and providing personalized recommendations for groups is referred to as the group recommendation task. Existing methods can usually be categorized into two strategies to infer group preferences: 1) determining group preferences by aggregating members' personalized preferences, and 2) inferring group consensus by capturing group members' coherent decisions after common compromises. However, the former suffers from a lack of group-level considerations, and the latter overlooks the fine-grained preferences of individual users. To this end, we propose a novel group recommendation method, AlignGroup, which focuses on both group consensus and individual preferences of group members to infer group decision-making. Specifically, AlignGroup explores group consensus through a well-designed hypergraph neural network that efficiently learns intra- and inter-group relationships. Moreover, AlignGroup innovatively utilizes a self-supervised alignment task to capture fine-grained group decision-making by aligning the group consensus with members' common preferences. Extensive experiments on two real-world datasets validate that AlignGroup outperforms the state of the art on both the group recommendation task and the user recommendation task, and is more efficient than most baselines.
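The first strategy mentioned above, aggregating members' individual preferences, can be illustrated with two classic aggregation rules. This is only a sketch of that baseline idea, not AlignGroup itself; the rule names ("average", "least_misery"), the function name, and the toy ratings are assumptions introduced for illustration.

```python
def aggregate_group_scores(member_scores, strategy="average"):
    """Combine per-member item scores into one group score per item."""
    items = member_scores[0].keys()
    if strategy == "average":
        return {i: sum(m[i] for m in member_scores) / len(member_scores) for i in items}
    if strategy == "least_misery":
        # The group score is the score of the least satisfied member.
        return {i: min(m[i] for m in member_scores) for i in items}
    raise ValueError(f"unknown strategy: {strategy}")


def top_item(group_scores):
    """Return the item with the highest group score."""
    return max(group_scores, key=group_scores.get)


# Toy ratings for a two-person group (made-up values).
members = [
    {"hiking": 5, "museum": 3},
    {"hiking": 2, "museum": 3},
]

by_average = top_item(aggregate_group_scores(members, "average"))
by_least_misery = top_item(aggregate_group_scores(members, "least_misery"))
```

The two rules pick different items for the same group, which shows why pure aggregation is sensitive to the chosen rule and lacks an explicit group-level signal.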
How to Watch Apple's iPhone 16 Launch Event, and What to Expect
If tech news is feeling a little repeaty, with new phones and gadgets arriving in a seemingly endless stream over the last few weeks, know that it's not Groundhog Day. But it is almost time for yet another Apple event where new hardware will show up. At an event at its company headquarters next week, Apple will unveil the iPhone 16, as well as the next Apple Watch and (most likely) some AirPods. But like most tech events these days, much of the presentation is likely to revolve around artificial intelligence. The promotional image for next Monday's event is an Apple logo wrapped in a colorful glow with all the shades commonly used for Siri, Apple's digital assistant.
Do We Trust What They Say or What They Do? A Multimodal User Embedding Provides Personalized Explanations
Ren, Zhicheng, Xiao, Zhiping, Sun, Yizhou
With the rapid development of social media, the importance of analyzing social network user data has also been put on the agenda. User representation learning in social media is a critical area of research, based on which we can conduct personalized content delivery or detect malicious actors. Being more complicated than many other types of data, social network user data is inherently multimodal. Various multimodal approaches have been proposed to harness both text (i.e., post content) and relation (i.e., inter-user interaction) information to learn user embeddings of higher quality. The advent of Graph Neural Network models enables more end-to-end integration of user text embeddings and user interaction graphs in social networks. However, most of those approaches do not adequately elucidate which aspects of the data - text or graph structure information - are more helpful for predicting each specific user under a particular task, putting some burden on personalized downstream analysis and untrustworthy information filtering. We propose a simple yet effective framework called Contribution-Aware Multimodal User Embedding (CAMUE) for social networks. We demonstrate with empirical evidence that our approach can provide personalized explainable predictions, automatically mitigating the impact of unreliable information. We also conducted case studies to show how reasonable our results are. We observe that for most users, graph structure information is more trustworthy than text information, but there are some reasonable cases where text helps more. Our work paves the way for more explainable, reliable, and effective social media user embedding, which allows for better personalized content delivery.
Laser: Parameter-Efficient LLM Bi-Tuning for Sequential Recommendation with Collaborative Information
Zhang, Xinyu, Hu, Linmei, Zhang, Luhao, Song, Dandan, Huang, Heyan, Nie, Liqiang
Sequential recommender systems are essential for discerning user preferences from historical interactions and facilitating targeted recommendations. Recent innovations employing Large Language Models (LLMs) have advanced the field by encoding item semantics, yet they often necessitate substantial parameter tuning and are resource-demanding. Moreover, these works fail to consider the diverse characteristics of different types of users and thus diminish recommendation accuracy. In this paper, we propose a parameter-efficient Large Language Model Bi-Tuning framework for sequential recommendation with collaborative information (Laser). Specifically, Bi-Tuning works by inserting trainable virtual tokens at both the prefix and suffix of the input sequence and freezing the LLM parameters, thus optimizing the LLM for sequential recommendation. In Laser, the prefix is utilized to incorporate user-item collaborative information and adapt the LLM to the recommendation task, while the suffix converts the output embeddings of the LLM from the language space to the recommendation space for the follow-up item recommendation. Furthermore, to capture the characteristics of different types of users when integrating the collaborative information via the prefix, we introduce M-Former, a lightweight MoE-based querying transformer that uses a set of query experts to integrate diverse user-specific collaborative information encoded by frozen ID-based sequential recommender systems, significantly improving the accuracy of recommendations. Extensive experiments on real-world datasets demonstrate that Laser can parameter-efficiently adapt LLMs to effective recommender systems, significantly outperforming state-of-the-art methods.
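The input layout that Bi-Tuning describes, with trainable virtual tokens on both sides of a sequence fed to a frozen LLM, can be sketched without any deep-learning library. Everything here is an illustrative assumption: the function names, the embedding size, the token counts, and the placeholder values are not taken from the paper, and no actual model or training step is involved.

```python
def assemble_bituning_sequence(prefix, item_embeddings, suffix):
    """Lay out the input as [prefix virtual tokens; item tokens; suffix virtual tokens]."""
    return [*prefix, *item_embeddings, *suffix]


def suffix_outputs(outputs, suffix_len):
    """Select the positions that correspond to the suffix virtual tokens."""
    return outputs[len(outputs) - suffix_len:]


embed_dim = 4  # hypothetical embedding size

# Trainable parts: the only vectors that would receive gradient updates.
prefix = [[0.1] * embed_dim for _ in range(2)]
suffix = [[0.2] * embed_dim for _ in range(1)]

# Frozen part: stand-ins for embeddings produced by the frozen LLM's embedding layer.
items = [[1.0] * embed_dim for _ in range(3)]

sequence = assemble_bituning_sequence(prefix, items, suffix)
readout = suffix_outputs(sequence, len(suffix))
```

In a real setup the frozen LLM would process `sequence`, and the hidden states at the suffix positions would be read out and mapped into the recommendation space; here the slice is applied to the inputs only to show which positions are involved.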
Foundation Models for Music: A Survey
Ma, Yinghao, Øland, Anders, Ragni, Anton, Del Sette, Bleiz MacSen, Saitis, Charalampos, Donahue, Chris, Lin, Chenghua, Plachouras, Christos, Benetos, Emmanouil, Shatri, Elona, Morreale, Fabio, Zhang, Ge, Fazekas, György, Xia, Gus, Zhang, Huan, Manco, Ilaria, Huang, Jiawen, Guinot, Julien, Lin, Liwei, Marinelli, Luca, Lam, Max W. Y., Sharma, Megha, Kong, Qiuqiang, Dannenberg, Roger B., Yuan, Ruibin, Wu, Shangda, Wu, Shih-Lun, Dai, Shuqi, Lei, Shun, Kang, Shiyin, Dixon, Simon, Chen, Wenhu, Huang, Wenhao, Du, Xingjian, Qu, Xingwei, Tan, Xu, Li, Yizhi, Tian, Zeyue, Wu, Zhiyong, Wu, Zhizheng, Ma, Ziyang, Wang, Ziyu
In recent years, foundation models (FMs) such as large language models (LLMs) and latent diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This comprehensive review examines state-of-the-art (SOTA) pre-trained models and foundation models in music, spanning representation learning, generative learning, and multimodal learning. We first contextualise the significance of music in various industries and trace the evolution of AI in music. By delineating the modalities targeted by foundation models, we discover that many music representations are underexplored in FM development. Then, emphasis is placed on the lack of versatility of previous methods across diverse music applications, along with the potential of FMs in music understanding, generation, and medical application. By comprehensively exploring the details of the model pre-training paradigm, architectural choices, tokenisation, finetuning methodologies, and controllability, we emphasise the important topics that should have been well explored, such as instruction tuning and in-context learning, scaling law and emergent ability, and long-sequence modelling. A dedicated section presents insights into music agents, accompanied by a thorough analysis of datasets and evaluations essential for pre-training and downstream tasks. Finally, by underscoring the vital importance of ethical considerations, we advocate that future research on FMs for music should focus more on issues such as interpretability, transparency, human responsibility, and copyright. The paper offers insights into future challenges and trends in FMs for music, aiming to shape the trajectory of human-AI collaboration in the music realm.