Personal Assistant Systems
Context Retrieval via Normalized Contextual Latent Interaction for Conversational Agent
Liu, Junfeng, Mei, Zhuocheng, Peng, Kewen, Vatsavai, Ranga Raju
Conversational agents leveraging AI, particularly deep learning, are emerging in both academic research and real-world applications. However, these applications still face challenges, including disrespecting knowledge and facts, not personalizing to user preferences, and enormous demand for computational resources during training and inference. Recent research efforts have been focused on addressing these challenges from various aspects, including supplementing various types of auxiliary information to the conversational agents. However, existing methods are still not able to effectively and efficiently exploit relevant information from these auxiliary supplements to further unleash the power of the conversational agents and the language models they use. In this paper, we present a novel method, PK-NCLI, that is able to accurately and efficiently identify relevant auxiliary information to improve the quality of conversational responses by learning the relevance among persona, chat history, and knowledge background through low-level normalized contextual latent interaction. Our experimental results indicate that PK-NCLI outperforms the state-of-the-art method, PK-FoCus, by 47.80%/30.61%/24.14% in terms of perplexity, knowledge grounding, and training efficiency, respectively, and maintained the same level of persona grounding performance. We also provide a detailed analysis of how different factors, including language model choices and trade-offs on training weights, would affect the performance of PK-NCLI.
MuseChat: A Conversational Music Recommendation System for Videos
Dong, Zhikang, Chen, Bin, Liu, Xiulong, Polak, Pawel, Zhang, Peng
Music recommendation for videos attracts growing interest in multi-modal research. However, existing systems focus primarily on content compatibility, often ignoring the users' preferences. Their inability to interact with users for further refinements or to provide explanations leads to a less satisfying experience. We address these issues with MuseChat, a first-of-its-kind dialogue-based recommendation system that personalizes music suggestions for videos. Our system consists of two key functionalities with associated modules: recommendation and reasoning. The recommendation module takes a video along with optional information including previous suggested music and user's preference as inputs and retrieves an appropriate music matching the context. The reasoning module, equipped with the power of Large Language Model (Vicuna-7B) and extended to multi-modal inputs, is able to provide reasonable explanation for the recommended music. To evaluate the effectiveness of MuseChat, we build a large-scale dataset, conversational music recommendation for videos, that simulates a two-turn interaction between a user and a recommender based on accurate music track information. Experiment results show that MuseChat achieves significant improvements over existing video-based music retrieval methods as well as offers strong interpretability and interactability.
Multiple Testing of Linear Forms for Noisy Matrix Completion
Ma, Wanteng, Du, Lilun, Xia, Dong, Yuan, Ming
See, e.g., Resnick and Varian (1997); Schafer et al. (2007); Koren et al. (2009); Davidson et al. (2010); McAuley and Leskovec (2013); Das et al. (2017). Consider, more specifically, representing the ratings of d 1 users on d 2 products/items by a d 1 d 2 matrix. For all practical purposes, both d 1 and d 2 can be very large yet only a rather small number of the entries can be observed. The idea is that if the interaction between users and products can be approximately captured by a handful of latent user-specific and product-specific characteristics, then it is possible to infer the whole user-item rating matrix from these sparsely observed entries, and hence recommend products to users who may be genuinely interested in them. Since the pioneering works of Cand` es and Tao (2009); Candes and Plan (2010); Candes and Recht (2012), a lot of impressive progress has been made to make these techniques more accurate and scalable, and to better understand the statistical and computational underpinnings of the problem.
Poisoning Attacks Against Contrastive Recommender Systems
Wang, Zongwei, Yu, Junliang, Gao, Min, Yin, Hongzhi, Cui, Bin, Sadiq, Shazia
Contrastive learning (CL) has recently gained significant popularity in the field of recommendation. Its ability to learn without heavy reliance on labeled data is a natural antidote to the data sparsity issue. Previous research has found that CL can not only enhance recommendation accuracy but also inadvertently exhibit remarkable robustness against noise. However, this paper identifies a vulnerability of CL-based recommender systems: Compared with their non-CL counterparts, they are even more susceptible to poisoning attacks that aim to promote target items. Our analysis points to the uniform dispersion of representations led by the CL loss as the very factor that accounts for this vulnerability. We further theoretically and empirically demonstrate that the optimization of CL loss can lead to smooth spectral values of representations. Based on these insights, we attempt to reveal the potential poisoning attacks against CL-based recommender systems. The proposed attack encompasses a dual-objective framework: One that induces a smoother spectral value distribution to amplify the CL loss's inherent dispersion effect, named dispersion promotion; and the other that directly elevates the visibility of target items, named rank promotion. We validate the destructiveness of our attack model through extensive experimentation on four datasets. By shedding light on these vulnerabilities, we aim to facilitate the development of more robust CL-based recommender systems.
Beyond Two-Tower Matching: Learning Sparse Retrievable Cross-Interactions for Recommendation
Su, Liangcai, Yan, Fan, Zhu, Jieming, Xiao, Xi, Duan, Haoyi, Zhao, Zhou, Dong, Zhenhua, Tang, Ruiming
Two-tower models are a prevalent matching framework for recommendation, which have been widely deployed in industrial applications. The success of two-tower matching attributes to its efficiency in retrieval among a large number of items, since the item tower can be precomputed and used for fast Approximate Nearest Neighbor (ANN) search. However, it suffers two main challenges, including limited feature interaction capability and reduced accuracy in online serving. Existing approaches attempt to design novel late interactions instead of dot products, but they still fail to support complex feature interactions or lose retrieval efficiency. To address these challenges, we propose a new matching paradigm named SparCode, which supports not only sophisticated feature interactions but also efficient retrieval. Specifically, SparCode introduces an all-to-all interaction module to model fine-grained query-item interactions. Besides, we design a discrete code-based sparse inverted index jointly trained with the model to achieve effective and efficient model inference. Extensive experiments have been conducted on open benchmark datasets to demonstrate the superiority of our framework. The results show that SparCode significantly improves the accuracy of candidate item matching while retaining the same level of retrieval efficiency with two-tower models. Our source code will be available at MindSpore/models.
CommunityAI: Towards Community-based Federated Learning
Murturi, Ilir, Donta, Praveen Kumar, Dustdar, Schahram
Federated Learning (FL) has emerged as a promising paradigm to train machine learning models collaboratively while preserving data privacy. However, its widespread adoption faces several challenges, including scalability, heterogeneous data and devices, resource constraints, and security concerns. Despite its promise, FL has not been specifically adapted for community domains, primarily due to the wide-ranging differences in data types and context, devices and operational conditions, environmental factors, and stakeholders. In response to these challenges, we present a novel framework for Community-based Federated Learning called CommunityAI. CommunityAI enables participants to be organized into communities based on their shared interests, expertise, or data characteristics. Community participants collectively contribute to training and refining learning models while maintaining data and participant privacy within their respective groups. Within this paper, we discuss the conceptual architecture, system requirements, processes, and future challenges that must be solved. Finally, our goal within this paper is to present our vision regarding enabling a collaborative learning process within various communities.
Finnish 5th and 6th graders' misconceptions about Artificial Intelligence
Mertala, Pekka, Fagerlund, Janne
Research on children's initial conceptions of AI is in an emerging state, which, from a constructivist viewpoint, challenges the development of pedagogically sound AI-literacy curricula, methods, and materials. To contribute to resolving this need in the present paper, qualitative survey data from 195 children were analyzed abductively to answer the following three research questions: What kind of misconceptions do Finnish 5th and 6th graders' have about the essence AI?; 2) How do these misconceptions relate to common misconception types?; and 3) How profound are these misconceptions? As a result, three misconception categories were identified: 1) Non-technological AI, in which AI was conceptualized as peoples' cognitive processes (factual misconception); 2) Anthropomorphic AI, in which AI was conceptualized as a human-like entity (vernacular, non-scientific, and conceptual misconception); and 3) AI as a machine with a pre-installed intelligence or knowledge (factual misconception). Majority of the children evaluated their AI-knowledge low, which implies that the misconceptions are more superficial than profound. The findings suggest that context-specific linguistic features can contribute to students' AI misconceptions. Implications for future research and AI literacy education are discussed.
Edge AI for Internet of Energy: Challenges and Perspectives
Himeur, Yassine, Sayed, Aya Nabil, Alsalemi, Abdullah, Bensaali, Faycal, Amira, Abbes
The digital landscape of the Internet of Energy (IoE) is on the brink of a revolutionary transformation with the integration of edge Artificial Intelligence (AI). This comprehensive review elucidates the promise and potential that edge AI holds for reshaping the IoE ecosystem. Commencing with a meticulously curated research methodology, the article delves into the myriad of edge AI techniques specifically tailored for IoE. The myriad benefits, spanning from reduced latency and real-time analytics to the pivotal aspects of information security, scalability, and cost-efficiency, underscore the indispensability of edge AI in modern IoE frameworks. As the narrative progresses, readers are acquainted with pragmatic applications and techniques, highlighting on-device computation, secure private inference methods, and the avant-garde paradigms of AI training on the edge. A critical analysis follows, offering a deep dive into the present challenges including security concerns, computational hurdles, and standardization issues. However, as the horizon of technology ever expands, the review culminates in a forward-looking perspective, envisaging the future symbiosis of 5G networks, federated edge AI, deep reinforcement learning, and more, painting a vibrant panorama of what the future beholds. For anyone vested in the domains of IoE and AI, this review offers both a foundation and a visionary lens, bridging the present realities with future possibilities.
Hyper-Relational Knowledge Graph Neural Network for Next POI
Zhang, Jixiao, Li, Yongkang, Zou, Ruotong, Zhang, Jingyuan, Fan, Zipei, Song, Xuan
With the advancement of mobile technology, Point of Interest (POI) recommendation systems in Location-based Social Networks (LBSN) have brought numerous benefits to both users and companies. Many existing works employ Knowledge Graph (KG) to alleviate the data sparsity issue in LBSN. These approaches primarily focus on modeling the pair-wise relations in LBSN to enrich the semantics and thereby relieve the data sparsity issue. However, existing approaches seldom consider the hyper-relations in LBSN, such as the mobility relation (a 3-ary relation: user-POI-time). This makes the model hard to exploit the semantics accurately. In addition, prior works overlook the rich structural information inherent in KG, which consists of higher-order relations and can further alleviate the impact of data sparsity.To this end, we propose a Hyper-Relational Knowledge Graph Neural Network (HKGNN) model. In HKGNN, a Hyper-Relational Knowledge Graph (HKG) that models the LBSN data is constructed to maintain and exploit the rich semantics of hyper-relations. Then we proposed a Hypergraph Neural Network to utilize the structural information of HKG in a cohesive way. In addition, a self-attention network is used to leverage sequential information and make personalized recommendations. Furthermore, side information, essential in reducing data sparsity by providing background knowledge of POIs, is not fully utilized in current methods. In light of this, we extended the current dataset with available side information to further lessen the impact of data sparsity. Results of experiments on four real-world LBSN datasets demonstrate the effectiveness of our approach compared to existing state-of-the-art methods.