
The Age of Chat

The New Yorker

Earlier this spring, I took the bus to the Moscone Center, in downtown San Francisco, where almost thirty thousand people had gathered for the annual Game Developers Conference (G.D.C.), which I was attending as a journalist. I had spent the previous few months out on maternity leave, and I was glad to return to work, to have meetings, to temporarily exit the domestic sphere. Participating in public life felt incredible, almost psychedelic. I loved making small talk with the bus driver, and eavesdropping on strangers. "Conferences are back," I heard one man say, sombrely, to another.


Pacos: Modeling Users' Interpretable and Context-Dependent Choices in Preference Reversals

arXiv.org Artificial Intelligence

Choice problems refer to the problem of selecting the best choices from several available items, and learning users' preferences in choice problems is of great importance in understanding users' decision-making mechanisms and providing personalized services. Existing works typically assume that people evaluate items independently. In practice, however, users' preferences depend on the market in which items are placed, which is known as the context effect, and the order of a user's preferences for two items may even be reversed, which is called preference reversal. In this work, we identify three factors contributing to context effects: users' adaptive weights, the inter-item comparison, and display positions. We propose a context-dependent preference model named Pacos as a unified framework that addresses these three factors simultaneously, and consider two design methods: an additive method with high interpretability and an ANN-based method with high accuracy. We study the conditions under which preference reversals occur and provide a theoretical proof of the effectiveness of Pacos in predicting when they would occur. Experimental results show that the proposed method outperforms prior works in predicting users' choices, and that its interpretability helps explain the cause of preference reversals. Choice problems, such as purchasing a festival gift or picking a restaurant, involve comparing several available items. Previous works on preference modeling and analysis typically assume that people evaluate items independently, and that the relative preference between two items is fixed regardless of other competing options [1]. However, numerous studies show that this independence assumption is frequently violated in reality [2], [3]. It is essential to model how the relative preference is influenced by competing options and to figure out how people select their best choices.
This study can help understand users' decision making mechanisms and offer personalized services, and provide important guidelines on pricing strategies and sales forecasts. To show this independence violation, we conduct a real user test. In our test, we set two markets of Xiaomi scale, as shown in Figure 1 (a) and (b). In these two markets, we consider sellers described by two attributes: price (¥) and seller reputation (REP).
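
As a rough illustration of how a preference between two sellers can flip when the market changes, the sketch below scores sellers by price and reputation after rescaling each attribute to the range present in the market. This is not the Pacos model: the range normalization, the weights `w_price` and `w_rep`, and the sellers A, B and C are all assumptions made up for this example.

```python
from typing import Dict, Tuple

Item = Tuple[float, float]  # (price, reputation); values below are hypothetical


def context_utilities(
    market: Dict[str, Item], w_price: float = 0.6, w_rep: float = 0.4
) -> Dict[str, float]:
    """Score each seller after min-max scaling attributes within this market.

    Assumption for illustration only: the context enters through the
    attribute ranges of the market, so adding a seller can rescale the others.
    """
    prices = [price for price, _ in market.values()]
    reps = [rep for _, rep in market.values()]
    p_lo = min(prices)
    p_span = (max(prices) - p_lo) or 1.0  # avoid dividing by zero
    r_lo = min(reps)
    r_span = (max(reps) - r_lo) or 1.0
    return {
        name: w_rep * (rep - r_lo) / r_span - w_price * (price - p_lo) / p_span
        for name, (price, rep) in market.items()
    }


def prefers(market: Dict[str, Item], a: str, b: str) -> bool:
    """True if seller `a` scores strictly higher than seller `b` in this market."""
    scores = context_utilities(market)
    return scores[a] > scores[b]


# Market 1: only A and B. Market 2: the same sellers plus an expensive seller C.
market_two = {"A": (100.0, 4.0), "B": (120.0, 4.5)}
market_three = dict(market_two, C=(200.0, 4.2))

print(prefers(market_two, "A", "B"))    # A is ahead when only A and B compete
print(prefers(market_three, "B", "A"))  # B is ahead once C widens the price range
```

In this toy setup the third seller stretches the price range, so the price gap between A and B counts for less and the reputation advantage of B dominates.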


Amazon's Echo Dot is down to $28, plus the rest of this week's best tech deals

Engadget

Summer can be a sleepy time for deals, but there was actually a fair amount of savings to be found on tech this week. Amazon's Prime Day is probably about a month away, but the company looked like it was getting a head start with discounts on Kindles, two Echo speakers, Fire TV devices and Blink mini cameras. Those prices may go lower during the event, but the savings are still good if you can't wait. Our favorite Sony headphones dropped back down to $348, and a few different Beats earbuds, including the Powerbeats Pro, saw discounts of up to 36 percent. Apple's latest laptop, the 15-inch MacBook Air, is already $100 off, and last year's XPS 15 from Dell is currently $800 off. Here are the best deals from this week that you can still get today. Pair a smart speaker with a smart plug and you have the underpinnings of a smart home setup.


CML-TTS A Multilingual Dataset for Speech Synthesis in Low-Resource Languages

arXiv.org Artificial Intelligence

In this paper, we present CML-TTS, a recursive acronym for CML-Multi-Lingual-TTS, a new Text-to-Speech (TTS) dataset developed at the Center of Excellence in Artificial Intelligence (CEIA) of the Federal University of Goias (UFG). CML-TTS is based on Multilingual LibriSpeech (MLS) and adapted for training TTS models, consisting of audiobooks in seven languages: Dutch, French, German, Italian, Portuguese, Polish, and Spanish. Additionally, we provide the YourTTS model, a multi-lingual TTS model, trained using 3,176.13 hours from CML-TTS and also with 245.07 hours from LibriTTS, in English. Our purpose in creating this dataset is to open up new research possibilities in the TTS area for multi-lingual models. The dataset is publicly available under the CC-BY 4.0 license.


AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets

arXiv.org Artificial Intelligence

High-quality data is essential for conversational recommendation systems and serves as the cornerstone of network architecture development and training strategy design. Existing works devote heavy human effort to manually labeling or designing and extending recommender dialogue templates. However, they suffer from two problems: (i) the limited number of human annotators means that datasets can hardly capture the rich, large-scale cases found in the real world, and (ii) the limited experience and knowledge of annotators lead to an uninformative corpus and inappropriate recommendations. In this paper, we propose a novel automatic dataset synthesis approach that can generate both large-scale and high-quality recommendation dialogues through a data2text generation process, where unstructured recommendation conversations are generated from structured graphs based on user-item information from the real world. In doing so, we comprehensively exploit: (i) rich personalized user profiles from traditional recommendation datasets, (ii) rich external knowledge from knowledge graphs, and (iii) the conversation ability contained in human-to-human conversational recommendation datasets. Extensive experiments validate the benefit brought by the automatically synthesized data under low-resource scenarios and demonstrate the promising potential to facilitate the development of a more effective conversational recommendation system.


Graph-Based Model-Agnostic Data Subsampling for Recommendation Systems

arXiv.org Artificial Intelligence

Data subsampling is widely used to speed up the training of large-scale recommendation systems. Most subsampling methods are model-based and often require a pre-trained pilot model to measure data importance via, e.g., sample hardness. However, when the pilot model is misspecified, model-based subsampling methods deteriorate. Since model misspecification is persistent in real recommendation systems, we instead propose model-agnostic data subsampling methods that explore only the input data structure represented by graphs. Specifically, we study the topology of the user-item graph to estimate the importance of each user-item interaction (an edge in the user-item graph) via graph conductance, followed by a propagation step on the network to smooth out the estimated importance value. Since our proposed method is model-agnostic, we can marry the merits of both model-agnostic and model-based subsampling methods. Empirically, we show that combining the two consistently improves over any single method on the datasets used. Experimental results on KuaiRec and MIND datasets demonstrate that our proposed methods achieve superior results compared to baseline approaches.
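
The general shape of scoring edges from graph structure alone and then smoothing the scores over the graph can be sketched as follows. This is not the paper's conductance-based estimator: the degree-based score `1 / sqrt(deg(user) * deg(item))`, the averaging rule in `propagate`, the mixing weight `alpha`, and the top-k selection are all stand-ins assumed for illustration, and the edge list is assumed to contain no duplicates.

```python
from collections import defaultdict
from math import sqrt
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (user, item); assumed unique in the input list


def edge_scores(edges: List[Edge]) -> Dict[Edge, float]:
    """Degree-based importance proxy (an assumption, not graph conductance)."""
    user_deg: Dict[str, int] = defaultdict(int)
    item_deg: Dict[str, int] = defaultdict(int)
    for user, item in edges:
        user_deg[user] += 1
        item_deg[item] += 1
    return {(u, i): 1.0 / sqrt(user_deg[u] * item_deg[i]) for u, i in edges}


def propagate(scores: Dict[Edge, float], alpha: float = 0.5) -> Dict[Edge, float]:
    """One smoothing step: mix each score with the mean of adjacent edges."""
    by_node: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for (u, i), s in scores.items():
        by_node[("user", u)].append(s)
        by_node[("item", i)].append(s)
    smoothed = {}
    for (u, i), s in scores.items():
        around = by_node[("user", u)] + by_node[("item", i)]
        # The edge itself appears once per endpoint, so remove it twice.
        total = sum(around) - 2 * s
        count = len(around) - 2
        mean = total / count if count else s
        smoothed[(u, i)] = (1 - alpha) * s + alpha * mean
    return smoothed


def subsample(edges: List[Edge], keep: int, alpha: float = 0.5) -> List[Edge]:
    """Keep the `keep` highest-scoring interactions after smoothing."""
    scores = propagate(edge_scores(edges), alpha)
    return sorted(scores, key=scores.get, reverse=True)[:keep]


interactions = [("u1", "i1"), ("u1", "i2"), ("u2", "i1"), ("u3", "i3")]
print(subsample(interactions, keep=2))
```

No model is trained or queried anywhere in this sketch, which is the property the abstract calls model-agnostic; a model-based score could be combined with these values afterwards.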


Fairness in Matching under Uncertainty

arXiv.org Artificial Intelligence

Systems based on algorithms and machine learning are increasingly used to guide or outright make decisions which strongly impact human lives; thus it is imperative to take fairness into account when designing such systems. Notions of fairness in computer science can be classified into those that try to capture fairness towards a group (Hardt et al., 2016; Hébert-Johnson et al., 2018; Kearns et al., 2018; Kleinberg et al., 2017) vs. those that try to be fair to each individual (Dwork et al., 2012; Kim et al., 2018, 2020). In our work, we focus on the latter notion. The most widely studied notion of individual fairness is due to the seminal work of Dwork et al. (2012): it assumes that a metric space on observable features of individuals captures similarity, and requires that outcomes of a resource allocation mechanism satisfy a certain Lipschitz continuity condition with respect to the given metric. Intuitively, this ensures that individuals who are similar according to the metric will be treated similarly by the mechanism. We consider a setting in which individuals have preferences over the outcomes of the resource allocation mechanism, focusing on the important setting of two-sided markets. Applications of this setting abound: matching students to schools, job fair participants to interviews, doctors to hospitals, patients to treatments, drivers to passengers in ride hailing, or advertisers to ad slots/users in online advertising (Abdulkadiroğlu and Sönmez, 2003; Bronfman et al., 2015; Mehta et al., 2013; Roth, 1986; Roth et al., 2007), to name a
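
The Lipschitz condition of Dwork et al. (2012) can be checked directly on a small example: for every pair of individuals x and y, the distance between their outcome distributions must not exceed d(x, y). The sketch below is a minimal illustration that assumes total variation distance between outcome distributions and a Lipschitz constant of 1. The individuals, their scores, and the school outcomes are hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, Hashable, Sequence

Dist = Dict[Hashable, float]  # outcome -> probability


def total_variation(p: Dist, q: Dist) -> float:
    """Total variation distance between two finite outcome distributions."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)


def is_individually_fair(
    individuals: Sequence[Hashable],
    mechanism: Dict[Hashable, Dist],
    metric: Callable[[Hashable, Hashable], float],
    tol: float = 1e-12,
) -> bool:
    """Check TV(M(x), M(y)) <= d(x, y) for every pair of individuals."""
    return all(
        total_variation(mechanism[x], mechanism[y]) <= metric(x, y) + tol
        for x, y in combinations(individuals, 2)
    )


# Hypothetical similarity scores; the metric is their absolute difference.
score = {"alice": 0.80, "bob": 0.82, "carol": 0.30}


def metric(x: str, y: str) -> float:
    return abs(score[x] - score[y])


smooth = {
    "alice": {"school_1": 0.70, "school_2": 0.30},
    "bob": {"school_1": 0.71, "school_2": 0.29},
    "carol": {"school_1": 0.30, "school_2": 0.70},
}
abrupt = dict(smooth, bob={"school_1": 0.0, "school_2": 1.0})

print(is_individually_fair(list(score), smooth, metric))  # similar people, similar odds
print(is_individually_fair(list(score), abrupt, metric))  # alice and bob diverge sharply
```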


Community Detection Attack against Collaborative Learning-based Recommender Systems

arXiv.org Artificial Intelligence

Collaborative learning-based recommender systems emerged following the success of collaborative learning techniques such as Federated Learning (FL) and Gossip Learning (GL). In these systems, users participate in the training of a recommender system while keeping their history of consumed items on their devices. While these solutions seemed appealing for preserving the privacy of the participants at first glance, recent studies have shown that collaborative learning can be vulnerable to a variety of privacy attacks. In this paper we propose a novel privacy attack called Community Detection Attack (CDA), which allows an adversary to discover the members of a community based on a set of items of her choice (e.g., discovering users interested in LGBT content). Through experiments on three real recommendation datasets and by using two state-of-the-art recommendation models, we assess the sensitivity of an FL-based recommender system as well as two flavors of Gossip Learning-based recommender systems to CDA. Results show that on all models and all datasets, the FL setting is more vulnerable to CDA than Gossip settings. We further evaluated two off-the-shelf mitigation strategies, namely differential privacy (DP) and a share-less policy, which consists of sharing only a subset of model parameters. Results show a better privacy-utility trade-off for the share-less policy compared to DP, especially in the Gossip setting.
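
The two mitigation ideas can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the configuration evaluated in the paper: the parameter names (`item_embeddings`, `user_embedding`), the choice of which parameters stay on the device, and the clip-then-add-Gaussian-noise pattern with `clip_norm` and `sigma` are all hypothetical. The privacy guarantee actually obtained depends on an accounting step that is not shown.

```python
import math
import random
from typing import Dict, Iterable, List

Params = Dict[str, List[float]]  # parameter name -> flat list of values


def share_less(params: Params, shared: Iterable[str]) -> Params:
    """Return only the named parameters; everything else stays on the device."""
    keep = set(shared)
    return {name: list(values) for name, values in params.items() if name in keep}


def clip_and_noise(
    update: List[float], clip_norm: float, sigma: float, rng: random.Random
) -> List[float]:
    """Clip an update to L2 norm `clip_norm`, then add Gaussian noise.

    The noise standard deviation is sigma * clip_norm, the usual pattern for
    the Gaussian mechanism; sigma = 0 disables the noise.
    """
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * scale + rng.gauss(0.0, sigma * clip_norm) for v in update]


local_model = {
    "item_embeddings": [0.1, 0.2, 0.3],
    "user_embedding": [0.9, -0.4],  # hypothetical: kept private under share-less
}
outgoing = share_less(local_model, shared=["item_embeddings"])
noisy = clip_and_noise(outgoing["item_embeddings"], 1.0, 0.5, random.Random(0))
print(sorted(outgoing), len(noisy))
```

The share-less policy reduces what an adversary can observe, while the noise-based approach perturbs what is observed; the abstract reports that the former gave the better privacy-utility trade-off in its experiments.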


Improving Training Stability for Multitask Ranking Models in Recommender Systems

arXiv.org Artificial Intelligence

Recommender systems play an important role in many content platforms. While most recommendation research is dedicated to designing better models to improve user experience, we found that research on stabilizing the training of such models is severely under-explored. As recommendation models become larger and more sophisticated, they are more susceptible to training instability issues, i.e., loss divergence, which can make the model unusable, waste significant resources and block model development. In this paper, we share our findings and the best practices we learned for improving the training stability of a real-world multitask ranking model for YouTube recommendations. We show some properties of the model that lead to unstable training and conjecture on the causes. Furthermore, based on our observations of training dynamics near the point of training instability, we hypothesize why existing solutions would fail, and propose a new algorithm to mitigate the limitations of existing solutions. Our experiments on a YouTube production dataset show that the proposed algorithm can significantly improve training stability without compromising convergence, compared with several commonly used baseline methods.


Debiasing Recommendation by Learning Identifiable Latent Confounders

arXiv.org Artificial Intelligence

Recommendation systems aim to predict users' feedback on items not exposed to them. Confounding bias arises due to the presence of unmeasured variables (e.g., the socio-economic status of a user) that can affect both a user's exposure and feedback. Existing methods either (1) make untenable assumptions about these unmeasured variables or (2) directly infer latent confounders from users' exposure. However, they cannot guarantee the identification of counterfactual feedback, which can lead to biased predictions. In this work, we propose a novel method, i.e., identifiable deconfounder (iDCF), which leverages a set of proxy variables (e.g., observed user features) to resolve the aforementioned non-identification issue. The proposed iDCF is a general deconfounded recommendation framework that applies proximal causal inference to infer the unmeasured confounders and identify the counterfactual feedback with theoretical guarantees. Extensive experiments on various real-world and synthetic datasets verify the proposed method's effectiveness and robustness.