 Personal Assistant Systems


GUESR: A Global Unsupervised Data-Enhancement with Bucket-Cluster Sampling for Sequential Recommendation

arXiv.org Artificial Intelligence

Sequential recommendation is a widely studied paradigm for learning users' dynamic interests from historical interactions in order to predict the next potential item. Although much research has made remarkable progress, existing methods are still plagued by two common issues: data sparsity caused by limited supervision signals, and data noise from accidental clicks. Several works have attempted to address these issues, but they ignore the complex associations among items across different sequences. Along this line, with the aim of learning representative item embeddings to alleviate this dilemma, we propose GUESR, from the view of graph contrastive learning. Specifically, we first construct the Global Item Relationship Graph (GIRG) from all interaction sequences and present the Bucket-Cluster Sampling (BCS) method to construct the sub-graphs. Then, graph contrastive learning on this reduced graph is developed to enhance item representations with complex associations from the global view. We subsequently extend the CapsNet module with a target-attention mechanism to derive users' dynamic preferences. Extensive experimental results demonstrate that GUESR not only achieves significant improvements but can also serve as a general enhancement strategy that improves performance when combined with other sequential recommendation methods.


Active Reward Learning from Multiple Teachers

arXiv.org Artificial Intelligence

Reward learning algorithms utilize human feedback to infer a reward function, which is then used to train an AI system. This human feedback is often a preference comparison, in which the human teacher compares several samples of AI behavior and chooses which they believe best accomplishes the objective. While reward learning typically assumes that all feedback comes from a single teacher, in practice these systems often query multiple teachers to gather sufficient training data. In this paper, we investigate this disparity, and find that algorithmic evaluation of these different sources of feedback facilitates more accurate and efficient reward learning. We formally analyze the value of information (VOI) when reward learning from teachers with varying levels of rationality, and define and evaluate an algorithm that utilizes this VOI to actively select teachers to query for feedback. Surprisingly, we find that it is often more informative to query comparatively irrational teachers. By formalizing this problem and deriving an analytical solution, we hope to facilitate improvement in reward learning approaches to aligning AI behavior with human values.
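
The claim that a less rational teacher can be more informative may be easier to see with a toy calculation. The sketch below is not the paper's algorithm: it assumes a Boltzmann-rational teacher (a common modeling choice that the abstract does not confirm), two hypothetical reward hypotheses that agree on which option is better but disagree on by how much, and uses the mutual information between hypothesis and answer as a stand-in for the value of information. All function names and numbers are illustrative.

```python
import math

def prefer_a_prob(reward_gap, beta):
    """Assumed Boltzmann-rational teacher: P(teacher picks A over B).

    reward_gap is r(A) - r(B); beta is the rationality level
    (larger beta means a more deterministic teacher).
    """
    return 1.0 / (1.0 + math.exp(-beta * reward_gap))

def binary_entropy(p):
    """Entropy in bits of a yes/no answer with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def info_gain(gaps, prior, beta):
    """Mutual information between the reward hypothesis and the answer."""
    probs = [prefer_a_prob(g, beta) for g in gaps]
    marginal = sum(w * p for w, p in zip(prior, probs))
    conditional = sum(w * binary_entropy(p) for w, p in zip(prior, probs))
    return binary_entropy(marginal) - conditional

# Two hypothetical hypotheses: A beats B by 1.0, or A beats B by 3.0.
gaps = [1.0, 3.0]
prior = [0.5, 0.5]

noisy = info_gain(gaps, prior, beta=0.5)   # comparatively irrational teacher
sharp = info_gain(gaps, prior, beta=10.0)  # near-deterministic teacher
print(f"noisy teacher: {noisy:.5f} bits, sharp teacher: {sharp:.5f} bits")
```

In this toy setting the near-deterministic teacher almost always picks A under both hypotheses, so its answer says little about the size of the gap, while the noisier teacher's error rate depends on the gap and therefore carries more information.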


People will have personal AI assistants, like ChatGPT: Web inventor Lee

#artificialintelligence

Tim Berners-Lee, the inventor of the World Wide Web (also known as the Web), said that in the future, people will have their own personal AI assistant, similar to ChatGPT. In a recent episode of CNBC's Beyond the Valley podcast, Berners-Lee said that his new company envisions people having online 'pods' where all of their personal data is stored. Inrupt, a startup co-founded by Berners-Lee, aims to provide web users with a single login that can be used across multiple websites. Inrupt intends to store individual users' data in digital containers as part of its work on developing that technology. The pods will be capable of granting websites or services access to some or all of a person's personal information, ranging from sleeping patterns to shopping preferences, reports Fortune.


The Future of Human Agency

#artificialintelligence

This report covers results from the 15th "Future of the Internet" canvassing that Pew Research Center and Elon University's Imagining the Internet Center have conducted together to gather expert views about important digital issues. This is a nonscientific canvassing based on a nonrandom sample; this broad array of opinions about where current trends may lead between 2022 and 2035 represents only the points of view of the individuals who responded to the queries. Pew Research Center and Elon's Imagining the Internet Center sampled from a database of experts to canvass from a wide range of fields, inviting entrepreneurs, professionals and policy people based in government bodies, nonprofits and foundations, technology businesses and think tanks, as well as interested academics and technology innovators. The predictions reported here came in response to a set of questions in an online canvassing conducted between June 29 and Aug. 8, 2022. In all, 540 technology innovators and developers, business and policy leaders, researchers and activists responded in some way to the question covered in this report. More on the methodology underlying this canvassing and the participants can be found in the section titled "About this canvassing of experts." Advances in the internet, artificial intelligence (AI) and online applications have allowed humans to vastly expand their capabilities and increase their capacity to tackle complex problems. These advances have given people the ability to instantly access and share knowledge and amplified their personal and collective power to understand and shape their surroundings. Today there is general agreement that smart machines, bots and systems powered mostly by machine learning and artificial intelligence will quickly increase in speed and sophistication between now and 2035.


The Elements of Visual Art Recommendation: Learning Latent Semantic Representations of Paintings

arXiv.org Artificial Intelligence

Artwork recommendation is challenging because it requires understanding how users interact with highly subjective content, the complexity of the concepts embedded within the artwork, and the emotional and cognitive reflections they may trigger in users. In this paper, we focus on efficiently capturing the elements (i.e., latent semantic relationships) of visual art for personalized recommendation. We propose and study recommender systems based on textual and visual feature learning techniques, as well as their combinations. We then perform a small-scale and a large-scale user-centric evaluation of the quality of the recommendations. Our results indicate that textual features compare favourably with visual ones, whereas a fusion of both captures the most suitable hidden semantic relationships for artwork recommendation. Ultimately, this paper contributes to our understanding of how to deliver content that suitably matches the user's interests and how they are perceived.


DTW-SiameseNet: Dynamic Time Warped Siamese Network for Mispronunciation Detection and Correction

arXiv.org Artificial Intelligence

Personal Digital Assistants (PDAs) - such as Siri, Alexa and Google Assistant, to name a few - play an increasingly important role in accessing information and completing tasks that span multiple domains, for diverse groups of users. A text-to-speech (TTS) module allows PDAs to interact in a natural, human-like manner, and plays a vital role when the interaction involves people with visual impairments or other disabilities. To cater to the needs of a diverse set of users, an inclusive TTS module must recognize and correctly pronounce text in different languages and dialects. Despite great progress in speech synthesis, the pronunciation accuracy of named entities in a multi-lingual setting still has considerable room for improvement. Existing approaches to correcting named entity (NE) mispronunciations, like retraining Grapheme-to-Phoneme (G2P) models or maintaining a TTS pronunciation dictionary, require expensive and time-consuming annotation of the ground-truth pronunciation. In this work, we present a highly precise, PDA-compatible pronunciation learning framework for the task of TTS mispronunciation detection and correction. In addition, we propose a novel mispronunciation detection model called DTW-SiameseNet, which employs metric learning with a Siamese architecture for Dynamic Time Warping (DTW) with triplet loss. We demonstrate that a locale-agnostic, privacy-preserving solution to the problem of TTS mispronunciation detection is feasible. We evaluate our approach on a real-world dataset, and a corpus of NE pronunciations of an anonymized audio dataset of person names recorded by participants from 10 different locales. Human evaluation shows our proposed approach improves pronunciation accuracy on average by ~6% compared to strong phoneme-based and audio-based baselines.
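
For readers unfamiliar with the two building blocks named in this abstract, the following is a minimal sketch of classic Dynamic Time Warping and a triplet loss computed on DTW distances. It is not the DTW-SiameseNet model: the paper learns a Siamese embedding, whereas this sketch assumes raw one-dimensional sequences and an absolute-difference local cost. The function names and the margin value are hypothetical.

```python
import math

def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two 1-D sequences.

    cost[i][j] holds the cheapest alignment cost of a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss using DTW as the distance (assumed form)."""
    d_pos = dtw_distance(anchor, positive)
    d_neg = dtw_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# A time-stretched copy of the anchor aligns perfectly under DTW.
anchor = [1.0, 2.0, 3.0]
stretched = [1.0, 2.0, 2.0, 3.0]
different = [5.0, 6.0, 7.0]
print(dtw_distance(anchor, stretched))
print(triplet_loss(anchor, stretched, different))
```

The loss is zero once the positive is closer to the anchor than the negative by at least the margin, which is the property metric learning with triplets tries to enforce.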


Item Cold Start Recommendation via Adversarial Variational Auto-encoder Warm-up

arXiv.org Artificial Intelligence

With numerous pieces of information emerging daily and greatly influencing people's lives, large-scale recommendation systems are necessary for connecting users with their desired information in a timely manner. However, the widely used embedding-based recommendation systems struggle to recommend new items because little interaction data is available for training a new item ID embedding, which is known as the item cold-start problem. The gap between the randomly initialized item ID embedding and the well-trained warm item ID embedding makes it hard for cold items to fit the recommendation system, which is trained on the data of historical warm items. To alleviate the performance decline in new-item recommendation, the distribution of the new item ID embedding should be close to that of the historical warm items. To achieve this goal, we propose an Adversarial Variational Autoencoder Warm-up model (AVAEW) to generate warm-up item ID embeddings for cold items. Specifically, we develop a conditional variational autoencoder model that leverages the side information of items to generate the warm-up item ID embedding. In particular, we introduce an adversarial module to enforce the alignment between the warm-up item ID embedding distribution and the historical item ID embedding distribution. We demonstrate the effectiveness and compatibility of the proposed method through extensive offline experiments on public datasets and online A/B tests on a real-world large-scale news recommendation platform.


Optimizing Audio Recommendations for the Long-Term: A Reinforcement Learning Perspective

arXiv.org Artificial Intelligence

We study the problem of optimizing a recommender system for outcomes that occur over several weeks or months. We begin by drawing on reinforcement learning to formulate a comprehensive model of users' recurring relationships with a recommender system. Measurement, attribution, and coordination challenges complicate algorithm design. We describe careful modeling -- including a new representation of user state and key conditional independence assumptions -- which overcomes these challenges and leads to simple, testable recommender system prototypes. We apply our approach to a podcast recommender system that makes personalized recommendations to hundreds of millions of listeners. A/B tests demonstrate that purposefully optimizing for long-term outcomes leads to large performance gains over conventional approaches that optimize for short-term proxies.


Self-Supervised Interest Transfer Network via Prototypical Contrastive Learning for Recommendation

arXiv.org Artificial Intelligence

Cross-domain recommendation has attracted increasing attention from industry and academia recently. However, most existing methods do not exploit the interest invariance between domains, which yields sub-optimal solutions. In this paper, we propose a cross-domain recommendation method: Self-supervised Interest Transfer Network (SITN), which can effectively transfer invariant knowledge between domains via prototypical contrastive learning. Specifically, we perform two levels of cross-domain contrastive learning: 1) instance-to-instance contrastive learning, 2) instance-to-cluster contrastive learning. We also take into account users' multi-granularity and multi-view interests. With this paradigm, SITN can explicitly learn the invariant knowledge of interest clusters between domains and accurately capture users' intents and preferences. We conducted extensive experiments on a public dataset and a large-scale industrial dataset collected from one of the world's leading e-commerce corporations. The experimental results indicate that SITN achieves significant improvements over state-of-the-art recommendation methods. Additionally, SITN has been deployed on a micro-video recommendation platform, and the online A/B testing results further demonstrate its practical value. Supplement is available at: https://github.com/fanqieCoffee/SITN-Supplement.
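
The two contrastive levels mentioned in this abstract can be illustrated with a generic InfoNCE-style loss. This is a sketch under assumptions, not SITN itself: it assumes cosine similarity, a temperature of 0.1, and a cluster prototype computed as a plain centroid; the abstract does not specify any of these, and all names below are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: pull the anchor toward the positive, away from negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

def centroid(vectors):
    """Mean vector of a cluster, used here as an assumed prototype."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

# Instance-to-instance: the positive is another view of the same user.
user_domain_a = [1.0, 0.0]
user_domain_b = [0.9, 0.1]
other_user = [0.0, 1.0]
instance_loss = info_nce(user_domain_a, user_domain_b, [other_user])

# Instance-to-cluster: the positive is the prototype of the user's cluster.
own_cluster = centroid([[1.0, 0.0], [0.8, 0.2]])
other_cluster = centroid([[0.0, 1.0], [0.2, 0.8]])
cluster_loss = info_nce(user_domain_a, own_cluster, [other_cluster])

print(instance_loss, cluster_loss)
```

Both losses shrink as the anchor moves closer to its positive (an instance or a prototype) relative to the negatives, which is the shared mechanism behind the two levels.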


The AI-native telco: Radical transformation to thrive in turbulent times

#artificialintelligence

Artificial intelligence (AI) is unlocking use cases that are transforming industries across a wide swath of the world's economy. From infrastructure that "self-heals" to radically reimagined (and touchless) customer service and experience; from large scale hyper-personalization to automatically created marketing messages and images leveraging Generative AI tools like ChatGPT--it is all a reality today. These AI solutions can powerfully augment and sometimes radically outperform most traditional business roles. This article is a collaborative effort by Joshan Abraham, Jorge Amar, Yuval Atsmon, Miguel Frade, and Tomás Lajous, representing views from McKinsey's Technology, Media & Telecommunications Practice. The impact from these solutions is becoming evident.