We develop collaborative topic Poisson factorization (CTPF), a generative model of articles and reader preferences. CTPF can be used to build recommender systems by learning from reader histories and content to recommend personalized articles of interest. In detail, CTPF models both reader behavior and article texts with Poisson distributions, connecting the latent topics that represent the texts with the latent preferences that represent the readers. This provides better recommendations than competing methods and gives an interpretable latent space for understanding patterns of readership. Further, we exploit stochastic variational inference to model massive real-world datasets. For example, we can fit CPTF to the full arXiv usage dataset, which contains over 43 million ratings and 42 million word counts, within a day. We demonstrate empirically that our model outperforms several baselines, including the previous state-of-the-art approach.
Modelling sparse and large data sets is highly in demand yet challenging in recommender systems. With the computation only on the non-zero ratings, Poisson Factorization (PF) enabled by variational inference has shown its high efficiency in scalable recommendation, e.g., modeling millions of ratings. However, as PF learns the ratings by individual users on items with the Gamma distribution, it cannot capture the coupling relations between users (items) and the rating popularity (i.e., favorable rating scores that are given to one item) and rating sparsity (i.e., those users (items) with many zero ratings) for one item (user). This work proposes Coupled Poisson Factorization (CPF) to learn the couplings between users (items), and the user/item attributes (i.e., metadata) are integrated into CPF to form the Metadata-integrated CPF (mCPF) to not only handle sparse but also popular ratings in very large-scale data. Our empirical results show that the proposed models significantly outperform PF and address the key limitations in PF for scalable recommendation.
Model-based collaborative filtering analyzes user-item interactions to infer latent factors that represent user preferences and item characteristics in order to predict future interactions. Most collaborative filtering algorithms assume that these latent factors are static, although it has been shown that user preferences and item perceptions drift over time. In this paper, we propose a conjugate and numerically stable dynamic matrix factorization (DCPF) based on compound Poisson matrix factorization that models the smoothly drifting latent factors using Gamma-Markov chains. We propose a numerically stable Gamma chain construction, and then present a stochastic variational inference approach to estimate the parameters of our model. We apply our model to time-stamped ratings data sets: Netflix, Yelp, and Last.fm, where DCPF achieves a higher predictive accuracy than state-of-the-art static and dynamic factorization models.
Models for recommender systems use latent factors to explain the preferences and behaviors of users with respect to a set of items (e.g., movies, books, academic papers). Typically, the latent factors are assumed to be static and, given these factors, the observed preferences and behaviors of users are assumed to be generated without order. These assumptions limit the explorative and predictive capabilities of such models, since users' interests and item popularity may evolve over time. To address this, we propose dPF, a dynamic matrix factorization model based on the recent Poisson factorization model for recommendations. dPF models the time evolving latent factors with a Kalman filter and the actions with Poisson distributions. We derive a scalable variational inference algorithm to infer the latent factors. Finally, we demonstrate dPF on 10 years of user click data from arXiv.org, one of the largest repository of scientific papers and a formidable source of information about the behavior of scientists. Empirically we show performance improvement over both static and, more recently proposed, dynamic recommendation models. We also provide a thorough exploration of the inferred posteriors over the latent variables.
A conjugate Gamma-Poisson model for Dynamic Matrix Factorization incorporated with metadata influence (mGDMF for short) is proposed to effectively and efficiently model massive, sparse and dynamic data in recommendations. Modeling recommendation problems with a massive number of ratings and very sparse or even no ratings on some users/items in a dynamic setting is very demanding and poses critical challenges to well-studied matrix factorization models due to the large-scale, sparse and dynamic nature of the data. Our proposed mGDMF tackles these challenges by introducing three strategies: (1) constructing a stable Gamma-Markov chain model that smoothly drifts over time by combining both static and dynamic latent features of data; (2) incorporating the user/item metadata into the model to tackle sparse ratings; and (3) undertaking stochastic variational inference to efficiently handle massive data. mGDMF is conjugate, dynamic and scalable. Experiments show that mGDMF significantly (both effectively and efficiently) outperforms the state-of-the-art static and dynamic models on large, sparse and dynamic data.