Personal Assistant Systems
Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation
Tian, Zhen, Bai, Ting, Zhang, Zibin, Xu, Zhiyuan, Lin, Kangyi, Wen, Ji-Rong, Zhao, Wayne Xin
With the growth of high-dimensional sparse data in web-scale recommender systems, the computational cost of learning high-order feature interactions in the CTR prediction task increases substantially, which limits the use of high-order interaction models in real industrial applications. Some recent knowledge distillation based methods transfer knowledge from complex teacher models to shallow student models to accelerate online model inference. However, they suffer from degraded model accuracy in the knowledge distillation process, and it is challenging to balance the efficiency and effectiveness of the shallow student models. To address this problem, we propose a Directed Acyclic Graph Factorization Machine (KD-DAGFM) to learn the high-order feature interactions from existing complex interaction models for CTR prediction via Knowledge Distillation. The proposed lightweight student model DAGFM can learn arbitrary explicit feature interactions from teacher networks, achieving approximately lossless performance, as proved by a dynamic programming algorithm. In addition, an improved general model, KD-DAGFM+, is shown to be effective in distilling both explicit and implicit feature interactions from any complex teacher model. Extensive experiments are conducted on four real-world datasets, including a large-scale industrial dataset from the WeChat platform with billions of feature dimensions. KD-DAGFM achieves the best performance with less than 21.5% of the FLOPs of the state-of-the-art method in both online and offline experiments, showing the superiority of DAGFM in dealing with industrial-scale data in the CTR prediction task. Our implementation code is available at: https://github.com/RUCAIBox/DAGFM.
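The teacher-to-student transfer described above can be illustrated with a generic distillation objective. This is only a minimal sketch, not the loss used by KD-DAGFM: the function names, the squared-error term toward the teacher, and the mixing weight `alpha` are all assumptions made for illustration.

```python
import math


def log_loss(label, prob, eps=1e-12):
    """Binary cross-entropy for one click label (0 or 1) and a predicted probability."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def distillation_loss(labels, student_probs, teacher_probs, alpha=0.5):
    """Blend the supervised CTR loss with a term pulling the student toward the teacher.

    alpha is a hypothetical mixing weight, not a value taken from the paper.
    """
    n = len(labels)
    # Supervised term: how well the student predicts the observed clicks.
    hard = sum(log_loss(y, p) for y, p in zip(labels, student_probs)) / n
    # Distillation term: mean squared gap between student and teacher outputs.
    soft = sum((p - t) ** 2 for p, t in zip(student_probs, teacher_probs)) / n
    return alpha * hard + (1.0 - alpha) * soft
```

With `alpha=0.0` the student is trained only to imitate the teacher, and with `alpha=1.0` it is trained only on click labels; intermediate values trade the two off.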
In conversation with Artificial Intelligence: aligning language models with human values
Kasirzadeh, Atoosa, Gabriel, Iason
Large-scale language technologies are increasingly used in various forms of communication with humans across different contexts. One particular use case for these technologies is conversational agents, which output natural language text in response to prompts and queries. This mode of engagement raises a number of social and ethical questions. For example, what does it mean to align conversational agents with human norms or values? Which norms or values should they be aligned with? And how can this be accomplished? In this paper, we propose a number of steps that help answer these questions. We start by developing a philosophical analysis of the building blocks of linguistic communication between conversational agents and human interlocutors. We then use this analysis to identify and formulate ideal norms of conversation that can govern successful linguistic communication between humans and conversational agents. Furthermore, we explore how these norms can be used to align conversational agents with human values across a range of different discursive domains. We conclude by discussing the practical implications of our proposal for the design of conversational agents that are aligned with these norms and values.
TikTok will explain why it recommends videos on its 'For You' page
The algorithm that powers TikTok's "For You" page has long been a source of fascination and suspicion. Fans often remark on the app's eerie accuracy, while TikTok critics have at times speculated the company could subtly manipulate its algorithm to influence its users in more nefarious ways. Now, the company is taking new steps to demystify some aspects of its algorithm. The app is introducing a feature that will "help people understand why a particular video has been recommended to them." With the update, users will be able to tap on a new question mark icon, which will list some factors that played a role in the recommendation.
Multi-Metric AutoRec for High Dimensional and Sparse User Behavior Data Prediction
Liang, Cheng, Huang, Teng, He, Yi, Deng, Song, Wu, Di, Luo, Xin
User behavior data produced during interaction with massive numbers of items in the big data era are generally heterogeneous and sparse, leaving the recommender system (RS) a large diversity of underlying patterns to excavate. Deep neural network-based models have reached state-of-the-art benchmark results in RS owing to their fitting capabilities. However, prior works mainly focus on designing an intricate architecture with a fixed loss function and regularization. These single-metric models provide limited performance when facing heterogeneous and sparse user behavior data. Motivated by this finding, we propose a multi-metric AutoRec (MMA) based on the representative AutoRec. The idea of the proposed MMA is mainly two-fold: 1) apply different $L_p$-norms to the loss function and regularization to form different variant models in different metric spaces, and 2) aggregate these variant models. Thus, the proposed MMA enjoys the multi-metric orientation from a set of dispersed metric spaces, achieving a comprehensive representation of user data. Theoretical studies prove that the proposed MMA can attain performance improvement. Extensive experiments on five real-world datasets show that MMA can outperform seven other state-of-the-art models in predicting unobserved user behavior data.
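The two-step idea (train variants under different $L_p$-norms, then aggregate them) can be sketched as follows. This is an illustrative sketch only: the abstract does not state how MMA aggregates its variants, so the uniform weighted average and the helper names `lp_loss` and `aggregate` are assumptions.

```python
def lp_loss(errors, p):
    """Mean of |e|^p over prediction errors (p=1: absolute error, p=2: squared error)."""
    return sum(abs(e) ** p for e in errors) / len(errors)


def aggregate(predictions_per_model, weights=None):
    """Weighted average of per-model predictions for the same user-item entries.

    Uniform weights are an assumption; the paper may use a different scheme.
    """
    m = len(predictions_per_model)
    if weights is None:
        weights = [1.0 / m] * m
    n = len(predictions_per_model[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions_per_model))
        for i in range(n)
    ]
```

For example, `lp_loss([1.0, -2.0], 1)` penalizes both errors linearly, while `lp_loss([1.0, -2.0], 2)` weights the larger error more heavily, which is why variants trained under different norms can capture different patterns in sparse data.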
Addressing the Selection Bias in Voice Assistance: Training Voice Assistance Model in Python with Equal Data Selection
Piya, Kashav, Shrestha, Srijal, Frank, Cameran, Jebessa, Estephanos, Mohd, Tauheed Khan
In recent times, voice assistants have become a part of our day-to-day lives, allowing information retrieval by voice synthesis, voice recognition, and natural language processing. These voice assistants can be found in many modern-day devices from companies such as Apple, Amazon, Google, and Samsung. This project is primarily focused on Virtual Assistance in Natural Language Processing. Natural Language Processing is a form of AI that helps machines understand people and create feedback loops. This project will use deep learning to create a Voice Recognizer, using Common Voice and data collected from the local community for model training in Google Colaboratory. After recognizing a command, the AI assistant will be able to perform the most suitable actions and then give a response. The motivation for this project comes from the race and gender bias that exists in many virtual assistants. The computer industry is primarily dominated by men, and because of this, many of the products produced do not take women into account. This bias has an impact on natural language processing. This project will utilize various open-source projects to implement machine learning algorithms and train the assistant algorithm to recognize different types of voices, accents, and dialects. The goal of this project is to use voice data from underrepresented groups to build a voice assistant that can recognize voices regardless of gender, race, or accent. Increasing the representation of women in the computer industry is important for the future of the industry. By representing women in the initial study of voice assistants, it can be shown that women play a vital role in the development of this technology. In line with related work, this project will use first-hand data from the college population and middle-aged adults to train a voice assistant to combat gender bias.
Next Period Recommendation Reality Check
Kolesnikov, Sergey, Lashinin, Oleg, Pechatov, Michail, Kosov, Alexander
Over the past decade, tremendous progress has been made in Recommender Systems (RecSys) for well-known tasks such as next-item and next-basket prediction. On the other hand, the recently proposed next-period recommendation (NPR) task is not covered as much. Current works about NPR are mostly based around distinct problem formulations, methods, and proprietary datasets, making solutions difficult to reproduce. In this article, we aim to fill the gap in RecSys methods evaluation on the NPR task using publicly available datasets and (1) introduce the TTRS, a large-scale financial transactions dataset suitable for RecSys methods evaluation; (2) benchmark popular RecSys approaches on several datasets for the NPR task. When performing our analysis, we found a strong repetitive consumption pattern in several real-world datasets. With this setup, our results suggest that the repetitive nature of data is still hard to generalize for the evaluated RecSys methods, and novel item prediction performance is still questionable.
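The "repetitive consumption pattern" mentioned above can be quantified in a simple way: for each user, measure what share of the items in the next period already appeared in that user's history. The sketch below is an assumption about how such a check might look, not the metric used in the article; `repeat_ratio` is a hypothetical helper.

```python
def repeat_ratio(history_items, next_period_items):
    """Share of distinct next-period items the user already consumed in earlier periods."""
    next_items = set(next_period_items)
    if not next_items:
        return 0.0  # nothing consumed in the next period
    return len(next_items & set(history_items)) / len(next_items)
```

A ratio close to 1 means the next period is dominated by repeat consumption, while a ratio close to 0 means the user mostly consumes novel items, which the abstract reports as the harder case for the evaluated methods.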
Current State of Artificial Intelligence
Artificial intelligence (AI) has come a long way in recent years and has made significant advances in a variety of fields. One of the most notable areas where AI has made significant progress is in machine learning, which allows computers to learn and adapt without being explicitly programmed. This has led to the development of many exciting and innovative applications, such as self-driving cars, voice recognition systems, and intelligent personal assistants. In addition to machine learning, AI has also made strides in natural language processing (NLP), which enables computers to understand and generate human-like language. This has led to the development of chatbots and virtual assistants that can converse with humans and respond to their questions and requests.
The power of human connection in decision-making in data-driven international student recruitment
We need career planners, not just people who secure admissions! The pandemic pushed the limits of online aggregators and EdTech companies in international student recruitment. We are witnessing technology slowly making a powerful impact on student recruitment, though the industry has yet to witness the full power of Artificial Intelligence and Automation. Building AI platforms is going to be cheaper, and replicating a technology model doesn't require a big innovation. The industry is going to be flooded with too much data for recruiters and students.
Discover the Top 10 Ways Artificial Intelligence is Revolutionizing Our World
Artificial intelligence (AI) is rapidly changing the way we live and work. From self-driving cars and personalized healthcare to virtual assistants and improved manufacturing processes, AI is transforming industries and improving our daily lives. One of the biggest impacts of AI is in the field of automation. With machine learning algorithms, computers can now perform tasks that were previously only possible for humans to do. This is particularly evident in manufacturing, where AI is being used to optimize production lines and reduce the need for human labor.
GitHub - jiwidi/MASTER_THESIS: Master thesis in collaboration with H&M
This thesis shows how NLP Deep Learning methods, trained on user interaction sequences from the H&M website, can be used to model user behavior and create personalized recommendations. We performed multiple experiments to show how an ordered user history helps the model learn. Both item and user representations proved to play an important role in our model's performance. The new models outperformed our baseline model and also showed different patterns in their recommendations, recommending less popular and more expensive items than the baseline. We believe the powerful representation learning and the ability to capture order within sequences are responsible for the performance improvements.