Personal Assistant Systems
Individualized Rank Aggregation using Nuclear Norm Regularization
In recent years rank aggregation has received significant attention from the machine learning community. The goal of such a problem is to combine the (partially revealed) preferences over objects of a large population into a single, relatively consistent ordering of those objects. However, in many cases, we might not want a single ranking and instead opt for individual rankings. We study a version of the problem known as collaborative ranking. In this problem we assume that individual users provide us with pairwise preferences (for example purchasing one item over another). From those preferences we wish to obtain rankings on items that the users have not had an opportunity to explore. The results here have a very interesting connection to the standard matrix completion problem. We provide a theoretical justification for a nuclear norm regularized optimization procedure, and provide high-dimensional scaling results that show how the error in estimating user preferences behaves as the number of observations increase.
Improving Cross-domain Recommendation through Probabilistic Cluster-level Latent Factor Model--Extended Version
Cross-domain recommendation has been proposed to transfer user behavior pattern by pooling together the rating data from multiple domains to alleviate the sparsity problem appearing in single rating domains. However, previous models only assume that multiple domains share a latent common rating pattern based on the user-item co-clustering. To capture diversities among different domains, we propose a novel Probabilistic Cluster-level Latent Factor (PCLF) model to improve the cross-domain recommendation performance. Experiments on several real world datasets demonstrate that our proposed model outperforms the state-of-the-art methods for the cross-domain recommendation task.
Structured Low-Rank Matrix Factorization with Missing and Grossly Corrupted Observations
Shang, Fanhua, Liu, Yuanyuan, Tong, Hanghang, Cheng, James, Cheng, Hong
Recovering low-rank and sparse matrices from incomplete or corrupted observations is an important problem in machine learning, statistics, bioinformatics, computer vision, as well as signal and image processing. In theory, this problem can be solved by the natural convex joint/mixed relaxations (i.e., l_{1}-norm and trace norm) under certain conditions. However, all current provable algorithms suffer from superlinear per-iteration cost, which severely limits their applicability to large-scale problems. In this paper, we propose a scalable, provable structured low-rank matrix factorization method to recover low-rank and sparse matrices from missing and grossly corrupted data, i.e., robust matrix completion (RMC) problems, or incomplete and grossly corrupted measurements, i.e., compressive principal component pursuit (CPCP) problems. Specifically, we first present two small-scale matrix trace norm regularized bilinear structured factorization models for RMC and CPCP problems, in which repetitively calculating SVD of a large-scale matrix is replaced by updating two much smaller factor matrices. Then, we apply the alternating direction method of multipliers (ADMM) to efficiently solve the RMC problems. Finally, we provide the convergence analysis of our algorithm, and extend it to address general CPCP problems. Experimental results verified both the efficiency and effectiveness of our method compared with the state-of-the-art methods.
Guess Who Rated This Movie: Identifying Users Through Subspace Clustering
Zhang, Amy, Fawaz, Nadia, Ioannidis, Stratis, Montanari, Andrea
It is often the case that, within an online recommender system, multiple users share a common account. Can such shared accounts be identified solely on the basis of the userprovided ratings? Once a shared account is identified, can the different users sharing it be identified as well? Whenever such user identification is feasible, it opens the way to possible improvements in personalized recommendations, but also raises privacy concerns. We develop a model for composite accounts based on unions of linear subspaces, and use subspace clustering for carrying out the identification task. We show that a significant fraction of such accounts is identifiable in a reliable manner, and illustrate potential uses for personalized recommendation.
Conditional Restricted Boltzmann Machines for Cold Start Recommendations
Restricted Boltzman Machines (RBMs) have been successfully used in recommender systems. However, as with most of other collaborative filtering techniques, it cannot solve cold start problems for there is no rating for a new item. In this paper, we first apply conditional RBM (CRBM) which could take extra information into account and show that CRBM could solve cold start problem very well, especially for rating prediction task. CRBM naturally combine the content and collaborative data under a single framework which could be fitted effectively. Experiments show that CRBM can be compared favourably with matrix factorization models, while hidden features learned from the former models are more easy to be interpreted.
Learning From Ordered Sets and Applications in Collaborative Ranking
Tran, Truyen, Phung, Dinh, Venkatesh, Svetha
Ranking over sets arise when users choose between groups of items. For example, a group may be of those movies deemed $5$ stars to them, or a customized tour package. It turns out, to model this data type properly, we need to investigate the general combinatorics problem of partitioning a set and ordering the subsets. Here we construct a probabilistic log-linear model over a set of ordered subsets. Inference in this combinatorial space is highly challenging: The space size approaches $(N!/2)6.93145^{N+1}$ as $N$ approaches infinity. We propose a \texttt{split-and-merge} Metropolis-Hastings procedure that can explore the state-space efficiently. For discovering hidden aspects in the data, we enrich the model with latent binary variables so that the posteriors can be efficiently evaluated. Finally, we evaluate the proposed model on large-scale collaborative filtering tasks and demonstrate that it is competitive against state-of-the-art methods.
Bayesian Probabilistic Matrix Factorization: A User Frequency Analysis
Severinski, Cody, Salakhutdinov, Ruslan
Matrix factorization (MF) has become a common approach to collaborative filtering, due to ease of implementation and scalability to large data sets. Two existing drawbacks of the basic model is that it does not incorporate side information on either users or items, and assumes a common variance for all users. We extend the work of constrained probabilistic matrix factorization by deriving the Gibbs updates for the side feature vectors for items (Salakhutdinov and Minh, 2008). We show that this Bayesian treatment to the constrained PMF model outperforms simple MAP estimation. We also consider extensions to heteroskedastic precision introduced in the literature (Lakshminarayanan, Bouchard, and Archambeau, 2011). We show that this tends result in overfitting for deterministic approximation algorithms (ex: Variational inference) when the observed entries in the user / item matrix are distributed in an non-uniform manner. In light of this, we propose a truncated precision model. Our experimental results suggest that this model tends to delay overfitting.
Spectral Bandits for Smooth Graph Functions with Applications in Recommender Systems
Kocák, Tomáš (Inria Lille) | Valko, Michal (Inria Lille) | Munos, Rémi ( Inria Lille and Microsoft Research New England ) | Kveton, Branislav (Technicolor California) | Agrawal, Shipra (Microsoft Research India)
Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this paper, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning problems that involve graphs, such as content-based recommendation. In this problem, each recommended item is a node and its expected rating is similar to its neighbors. The goal is to recommend items that have high expected ratings. We aim for the algorithms where the cumulative regret would not scale poorly with the number of nodes. In particular, we introduce the notion of an effective dimension, which is small in real-world graphs, and propose two algorithms for solving our problem that scale linearly in this dimension. Our experiments on real-world content recommendation problem show that a good estimator of user preferences for thousands of items can be learned from just tens nodes evaluations.
Uncovering Response Biases in Recommendation
Park, Kyung-Wha (Seoul National University) | Kim, Byoung-Hee (Seoul National University) | Park, Tae-Suh (Seoul National University) | Zhang, Byoung-Tak (Seoul National University)
An user-specific tendency of biased movie rating is investigated, leading six identified types of rating pattern in a massive movie rating dataset. Based on the observed bias assumption, we propose a rescaling method of preferential scores by considering the rating types. Experimental results show significant enhancement for movie recommendation systems.
Preface
Srivastava, Biplav (IBM Research - India) | Lozano, Aurelie (IBM TJ Watson Research Center) | Marecki, Janusz (IBM TJ Watson Research Center) | Rish, Irina (IBM TJ Watson Research Center) | Salakhudtinov, Ruslan (University of Toronto) | Tesauro, Gerald (IBM TJ Watson Research Center) | Veloso, Manuela (Carnegie Mellon University)
This workshop seeks to augment human decision making by exploiting synergies across two areas of AI research where exciting research progress has been made in recent years, but which so far have not had an explicit common venue. The first area has to do with powerful new learning techniques that may have the potential to automatically learn complex tasks by directly training on massive amounts of raw data, much of which may be unlabeled, unstructured, and multi-modal in form (natural language text/speech, audio, video, and others). These techniques include deep learning, manifold learning, sparsity-based techniques, and transfer/cross-modal learning and inference methods. Researchers employing such techniques have recently achieved quantum performance leaps in speech and image recognition tasks, and have also demonstrated the ability to learn complex feature representations entirely from unlabeled data. The second area has to do with enabling computers to understand and work with naturalistic input from humans, in the form of natural language speech or text, visual input such as gestures or facial expressions, and haptic (touch-based) inputs. The most exciting demonstrations of these capabilities in the last few years include Question-Answering systems such as Watson and Wolfram Alpha, and commercially deployed personal assistant technology such as Siri, Google Now, Dragon Mobile Assistant, Nina, and TellMe. Synergistic advances in these two trends could vastly improve human decision making in many scenarios, including information overload (such as driving), cognition impairment (for example Alzheimer's) or collective (multi-objective) decision-making (such as conference program scheduling, disaster response). Cognitive computing is an emerging research topic inspired by