Personal Assistant Systems
Diversify and Conquer: Bandits and Diversity for an Enhanced E-commerce Homepage Experience
Jaiswal, Sangeet, Malayil, Korah T, Jawaid, Saif, Vempati, Sreekanth
In the realm of e-commerce, popular platforms utilize widgets to recommend advertisements and products to their users. However, the prevalence of mobile device usage on these platforms introduces a unique challenge due to the limited screen real estate available. Consequently, the positioning of relevant widgets becomes pivotal in capturing and maintaining customer engagement. Given the restricted screen size of mobile devices, widgets placed at the top of the interface are more prominently displayed and thus attract greater user attention. Conversely, widgets positioned further down the page require users to scroll, resulting in reduced visibility and subsequently lower impression rates. Therefore, it becomes imperative to place relevant widgets at the top. However, selecting relevant widgets to display is a challenging task because widgets can be heterogeneous and can be introduced to or removed from the platform at any time. In this work, we model vertical widget reordering as a contextual multi-armed bandit problem with delayed batch feedback. The objective is to rank the vertical widgets in a personalized manner. We present a two-stage ranking framework that combines contextual bandits with a diversity layer to improve the overall ranking. We demonstrate its effectiveness through offline and online A/B results, conducted on proprietary data from Myntra, a major fashion e-commerce platform in India.
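The two-stage idea (relevance scores first, a diversity layer second) can be illustrated with a minimal sketch. The abstract does not specify how either stage is implemented, so everything here is an assumption for illustration: the scores stand in for the output of a contextual bandit, and the function name `rerank_with_diversity`, the `penalty` parameter, the widget names, and the category labels are all hypothetical. The diversity layer shown is a simple greedy re-ranking that penalizes repeated widget categories, not the authors' method.

```python
from collections import Counter


def rerank_with_diversity(scores, categories, penalty=0.3):
    """Greedy re-ranking: repeatedly pick the widget with the best adjusted score.

    scores     -- widget -> relevance score (assumed to come from a bandit stage)
    categories -- widget -> category label (hypothetical grouping of widgets)
    penalty    -- score deducted per already-placed widget of the same category
    """
    remaining = set(scores)
    placed_per_category = Counter()
    ranking = []
    while remaining:
        # Iterate in sorted order so ties are broken deterministically by name.
        best = max(
            sorted(remaining),
            key=lambda w: scores[w] - penalty * placed_per_category[categories[w]],
        )
        ranking.append(best)
        placed_per_category[categories[best]] += 1
        remaining.remove(best)
    return ranking


# Illustrative inputs only; these are not data from the paper.
scores = {"banner_a": 0.9, "banner_b": 0.85, "carousel": 0.7, "video": 0.4}
categories = {
    "banner_a": "banner",
    "banner_b": "banner",
    "carousel": "product",
    "video": "media",
}
print(rerank_with_diversity(scores, categories))
```

With the penalty set to zero the ranking reduces to a plain sort by score, so the two banners occupy the first two slots. With a positive penalty the second banner is pushed below the carousel.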
Borda Regret Minimization for Generalized Linear Dueling Bandits
Wu, Yue, Jin, Tao, Lou, Hao, Farnoud, Farzad, Gu, Quanquan
Dueling bandits are widely used to model preferential feedback prevalent in many applications such as recommendation systems and ranking. In this paper, we study the Borda regret minimization problem for dueling bandits, which aims to identify the item with the highest Borda score while minimizing the cumulative regret. We propose a rich class of generalized linear dueling bandit models, which cover many existing models. We first prove a regret lower bound of order $\Omega(d^{2/3} T^{2/3})$ for the Borda regret minimization problem, where $d$ is the dimension of contextual vectors and $T$ is the time horizon. To attain this lower bound, we propose an explore-then-commit type algorithm for the stochastic setting, which has a nearly matching regret upper bound $\tilde{O}(d^{2/3} T^{2/3})$. We also propose an EXP3-type algorithm for the adversarial linear setting, where the underlying model parameter can change at each round. Our algorithm achieves an $\tilde{O}(d^{2/3} T^{2/3})$ regret, which is also optimal. Empirical evaluations on both synthetic data and a simulated real-world environment are conducted to corroborate our theoretical analysis.
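To make the Borda objective concrete, the following is a minimal sketch for a small, finite set of items without contextual features. It is a deliberate simplification and not the generalized linear algorithm proposed in the paper. It assumes the usual definition of the Borda score of item i as the average probability that i beats each other item, and the explore-then-commit routine uses a round-robin exploration schedule chosen here for simplicity. The function names and the preference matrix are hypothetical.

```python
import itertools
import random


def borda_scores(P):
    """Borda score of item i: mean of P[i][j] over all j != i."""
    k = len(P)
    return [sum(P[i][j] for j in range(k) if j != i) / (k - 1) for i in range(k)]


def explore_then_commit(P, explore_rounds, rng):
    """Duel pairs round-robin, estimate Borda scores, return the item to commit to.

    P[i][j] is the (unknown to the learner) probability that item i beats item j;
    it is used here only to simulate duel outcomes.
    """
    k = len(P)
    pairs = list(itertools.combinations(range(k), 2))
    wins = [[0] * k for _ in range(k)]
    for t in range(explore_rounds):
        i, j = pairs[t % len(pairs)]
        if rng.random() < P[i][j]:
            wins[i][j] += 1
        else:
            wins[j][i] += 1
    # Pairs that were never observed keep the uninformative estimate 0.5.
    estimate = [[0.5] * k for _ in range(k)]
    for i, j in pairs:
        n = wins[i][j] + wins[j][i]
        if n:
            estimate[i][j] = wins[i][j] / n
            estimate[j][i] = wins[j][i] / n
    scores = borda_scores(estimate)
    return max(range(k), key=scores.__getitem__)


# Item 0 beats every other item head to head, yet item 1 has the higher Borda score.
P = [
    [0.5, 0.6, 0.6],
    [0.4, 0.5, 0.9],
    [0.4, 0.1, 0.5],
]
print(borda_scores(P))
print(explore_then_commit(P, explore_rounds=3000, rng=random.Random(0)))
```

The example matrix shows why the Borda winner is a distinct target: the item that wins every pairwise comparison is not necessarily the item with the highest average win probability.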
Related Rhythms: Recommendation System To Discover Music You May Like
Singh, Rahul, Kanuparthi, Pranav
Machine Learning models are being utilized extensively to drive recommender systems, which is a widely explored topic today. This is especially true of the music industry, where we are witnessing a surge in growth. Besides a large chunk of active users, these systems are fueled by massive amounts of data. These large-scale systems yield applications that aim to provide a better user experience and to keep customers actively engaged. In this paper, a distributed Machine Learning (ML) pipeline is delineated, which is capable of taking a subset of songs as input and producing a new subset of songs identified as being similar to the inputted subset. The publicly accessible Million Songs Dataset (MSD) enables researchers to develop and explore reasonably efficient systems for audio track analysis and recommendations, without having to access a commercialized music platform. The objective of the proposed application is to leverage an ML system trained to optimally recommend songs that a user might like.
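The input/output contract described above (a subset of songs in, a subset of similar songs out) can be sketched as follows. The abstract does not state which features or similarity measure the pipeline uses, so this sketch assumes each song is represented by a small numeric feature vector and ranks candidates by cosine similarity to the centroid of the input subset. The function names, song identifiers, and feature values are hypothetical, and the sketch runs on a single machine rather than as a distributed pipeline.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend_similar(features, seed_ids, top_n=2):
    """Return the top_n songs outside seed_ids closest to the seeds' centroid."""
    dim = len(next(iter(features.values())))
    centroid = [
        sum(features[s][d] for s in seed_ids) / len(seed_ids) for d in range(dim)
    ]
    candidates = sorted(s for s in features if s not in seed_ids)
    candidates.sort(key=lambda s: cosine(features[s], centroid), reverse=True)
    return candidates[:top_n]


# Hypothetical two-dimensional features; real audio features would be richer.
features = {
    "song_a": [1.0, 0.0],
    "song_b": [0.9, 0.1],
    "song_c": [0.0, 1.0],
    "song_d": [0.8, 0.3],
    "song_e": [0.1, 0.9],
}
print(recommend_similar(features, {"song_a", "song_b"}))
```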
Beyond Deep Fakes
Within the next five years, the way we work, live, play, and learn will be changed by digital humans (chatbots and avatars with very realistic human faces). Digital humans are already gaining popularity as social media influencers, and they will soon evolve into digital sales assistants, fashion advisers, and personal shoppers able to model how customers will look and move in the latest ensembles. Digital humans will become central to the multibillion-dollar fashion industry, as social media is further integrated into the retail customer experience. Digital humans will also help in healthcare, enabling medical students and social workers to develop better interview skills for patients in sensitive clinical settings. They will allow people, especially those with mental health challenges, to rehearse for job interviews. They will help keep elderly people connected to their communities and respectfully monitored so they can remain in their homes longer. They will provide a human face for personalized advice, support, and training--and do it at scale. This has become possible with the advent of cost-effective, highly realistic, personalized interactive digital agents and avatars sporting high-fidelity facial simulations powered by advances in both real-time neural rendering (NR) and low-latency computing. NR refers to the use of machine-learning (ML) techniques to generate digital faces or face replacements in video [17]. NR rose to prominence with the advent of so-called "deep fakes"--the replacement of someone's face in videos with an NR-generated face of remarkable realism. The term originates from the name of a Reddit user (/u/deepfakes), an ML engineer who posted the original deep fake auto-encoder. Often used for satire, deep fakes can be harmful, presenting novel ethical issues.
The best-known examples involve deep fakes of celebrities, a form of face "hijacking" whereby publicly available videos of a person are used to train an ML program that overlays the source person's face onto existing video footage; this technique was originally used in pornographic material.
Designing a Framework for Conversational Interfaces
The conversational interface is an idea that is forever on the cusp of transforming the world. The potential is undeniable: Everyone has innate, untapped conversational expertise. We could do away with the nested menus required by visual interfaces; anything the user can name is immediately at hand. We could turn natural language into a declarative scripting language and operating systems into integrated development environments (IDEs). Reality, however, has not lived up to this potential.
QFA2SR: Query-Free Adversarial Transfer Attacks to Speaker Recognition Systems
Chen, Guangke, Zhang, Yedi, Zhao, Zhe, Song, Fu
Current adversarial attacks against speaker recognition systems (SRSs) require either white-box access or heavy black-box queries to the target SRS, thus still falling behind practical attacks against proprietary commercial APIs and voice-controlled devices. To fill this gap, we propose QFA2SR, an effective and imperceptible query-free black-box attack, by leveraging the transferability of adversarial voices. To improve transferability, we present three novel methods: tailored loss functions, SRS ensemble, and time-freq corrosion. The first one tailors loss functions to different attack scenarios. The latter two augment surrogate SRSs in two different ways. SRS ensemble combines diverse surrogate SRSs with new strategies, amenable to the unique scoring characteristics of SRSs. Time-freq corrosion augments surrogate SRSs by incorporating well-designed time-/frequency-domain modification functions, which simulate and approximate the decision boundary of the target SRS and distortions introduced during over-the-air attacks. QFA2SR boosts the targeted transferability by 20.9%-70.7% on four popular commercial APIs (Microsoft Azure, iFlytek, Jingdong, and TalentedSoft), significantly outperforming existing attacks in the query-free setting, with negligible effect on the imperceptibility. QFA2SR is also highly effective when launched over the air against three widespread voice assistants (Google Assistant, Apple Siri, and TMall Genie) with 60%, 46%, and 70% targeted transferability, respectively.
Smarter AI Assistants Could Make It Harder to Stay Human
Researchers and futurists have been talking for decades about the day when intelligent software agents will act as personal assistants, tutors, and advisers. Apple produced its famous Knowledge Navigator video in 1987. I seem to remember attending an MIT Media Lab event in the 1990s about software agents, where the moderator appeared as a butler, in a bowler hat. With the advent of generative AI, that gauzy vision of software as aide-de-camp has suddenly come into focus. WIRED's Will Knight provided an overview this week of what's available now and what's imminent.
Diffusion Augmentation for Sequential Recommendation
Liu, Qidong, Yan, Fan, Zhao, Xiangyu, Du, Zhaocheng, Guo, Huifeng, Tang, Ruiming, Tian, Feng
Sequential recommendation (SRS), which aims to recommend the next item based on a user's historical interactions, has recently become the technical foundation of many applications. However, sequential recommendation often faces the problem of data sparsity, which is widespread in recommender systems. Besides, most users interact with only a few items, and existing SRS models often underperform for these users. This problem, known as the long-tail user problem, remains unresolved. Data augmentation is a distinct way to alleviate these two problems, but existing augmentation methods often need specially designed training strategies or are hindered by poor-quality generated interactions. To address these problems, we propose Diffusion Augmentation for Sequential Recommendation (DiffuASR) for higher-quality generation. The dataset augmented by DiffuASR can be used to train sequential recommendation models directly, free from complex training procedures. To make the best of the generation ability of the diffusion model, we first propose a diffusion-based pseudo sequence generation framework to fill the gap between image and sequence generation. Then, a sequential U-Net is designed to adapt the diffusion noise prediction model U-Net to the discrete sequence generation task. Finally, we develop two guide strategies to align the preferences of the generated and original sequences. To validate the proposed DiffuASR, we conduct extensive experiments on three real-world datasets with three sequential recommendation models. The experimental results illustrate the effectiveness of DiffuASR. As far as we know, DiffuASR is among the first to introduce the diffusion model to recommendation.
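For readers unfamiliar with diffusion models, the sketch below shows the standard forward (noising) step of a denoising diffusion model applied to a sequence of item embedding vectors. This is only the generic building block, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise, under an assumed linear beta schedule. It is not DiffuASR itself: the sequential U-Net, the guide strategies, and the mapping from embeddings back to discrete items are not shown, and the function names, schedule values, and toy embeddings are assumptions.

```python
import math
import random


def linear_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_noise(sequence, t, alpha_bars, rng):
    """Noise every embedding in the sequence to diffusion step t.

    sequence -- list of item embeddings (each a list of floats)
    t        -- zero-based diffusion step index into alpha_bars
    """
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [
        [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in item]
        for item in sequence
    ]


# Toy interaction sequence of three items with two-dimensional embeddings.
alpha_bars = linear_alpha_bars(100)
clean_sequence = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
noisy_sequence = forward_noise(clean_sequence, 50, alpha_bars, random.Random(0))
print(noisy_sequence)
```

A noise prediction model would then be trained to recover the added noise from the noisy sequence, which is the step the abstract's sequential U-Net is designed for.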
Modeling Recommender Ecosystems: Research Challenges at the Intersection of Mechanism Design, Reinforcement Learning and Generative Models
Boutilier, Craig, Mladenov, Martin, Tennenholtz, Guy
Modern recommender systems lie at the heart of complex ecosystems that couple the behavior of users, content providers, advertisers, and other actors. Despite this, the focus of the majority of recommender research -- and most practical recommenders of any import -- is on the local, myopic optimization of the recommendations made to individual users. This comes at a significant cost to the long-term utility that recommenders could generate for their users. We argue that explicitly modeling the incentives and behaviors of all actors in the system -- and the interactions among them induced by the recommender's policy -- is strictly necessary if one is to maximize the value the system brings to these actors and improve overall ecosystem "health". Doing so requires: optimization over long horizons using techniques such as reinforcement learning; making inevitable tradeoffs in the utility that can be generated for different actors using the methods of social choice; reducing information asymmetry, while accounting for incentives and strategic behavior, using the tools of mechanism design; better modeling of both user and item-provider behaviors by incorporating notions from behavioral economics and psychology; and exploiting recent advances in generative and foundation models to make these mechanisms interpretable and actionable. We propose a conceptual framework that encompasses these elements, and articulate a number of research challenges that emerge at the intersection of these different disciplines.
Everything Amazon announced at its 2023 Devices and Services event
Amazon's fall hardware event was chock full of updates. Perhaps unsurprisingly, given the generative AI boom from the last year, the company began transforming Alexa into a much more versatile and conversational personal chatbot. But it also had plenty of new hardware to introduce, with new models of the Echo Show, security cameras, Echo Frames, a 10-gigabit router and more. Here's everything Amazon unveiled on Wednesday. As generative AI has exploded in popularity during the last year, task-focused personal assistants like Siri, Google Assistant and Alexa now seem even more dated than they did before.