Streaming tensor factorization is a powerful tool for processing high-volume, multi-way temporal data in Internet networks, recommender systems, and image/video data analysis. Existing streaming tensor factorization algorithms rely on least-squares data fitting and have no mechanism for determining the tensor rank, which leaves them susceptible to outliers and vulnerable to over-fitting. This paper presents a Bayesian robust streaming tensor factorization model that identifies sparse outliers, automatically determines the underlying tensor rank, and accurately fits the low-rank structure. We implement our model in MATLAB and compare it with existing algorithms on tensor datasets generated from dynamic MRI and Internet traffic.
Tensor completion is the problem of filling in the missing or unobserved entries of partially observed tensors. Because tensors naturally describe complex multidimensional datasets, tensor completion algorithms and their applications have received wide attention in data mining, computer vision, signal processing, and neuroscience. In this survey, we provide a modern overview of recent advances in tensor completion algorithms from the perspective of big data analytics, which is characterized by diverse variety, large volume, and high velocity. To aid comprehension and comparison of the many existing advances, we group them into four categories: general tensor completion algorithms, tensor completion with auxiliary information (variety), scalable tensor completion algorithms (volume), and dynamic tensor completion algorithms (velocity). We also introduce their applications to real-world data-driven problems and present an open-source package covering several widely used tensor decomposition and completion algorithms. Our goal is to summarize these popular methods for researchers, to promote research in this field, and to provide a ready-to-use repository for practitioners. Finally, we discuss some challenges and promising research directions for future exploration.
Probabilistic approaches for tensor factorization aim to extract meaningful structure from incomplete data by postulating low-rank constraints. Recently, variational Bayesian (VB) inference techniques have been successfully applied to large-scale models. This paper presents full Bayesian inference via VB on both single and coupled tensor factorization models. Our method can be run even for very large models and is easily implemented. On several real-world datasets for the missing link prediction problem, it exhibits better prediction performance than existing approaches based on maximum likelihood.
We present a probabilistic model for tensor decomposition in which one or more tensor modes may have side information about the mode entities, in the form of their features and/or their adjacency network. We consider a Bayesian approach based on the Canonical PARAFAC (CP) decomposition and enrich this single-layer decomposition with a two-layer decomposition. The second layer fits a factor model for each layer-one factor matrix and models the factor matrix via the mode entities' features and/or the network between the mode entities. The second-layer decomposition of each factor matrix also learns a binary latent representation for the entities of that mode, which can be useful in its own right. Our model can handle both continuous and binary tensor observations. Another appealing aspect of our model is the simplicity of inference, with easy-to-sample Gibbs updates. We demonstrate the results of our model on several benchmark datasets, consisting of both real-valued and binary tensors.
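For reference, the standard rank-R CP model that this abstract builds on can be written as follows for a three-way tensor. The symbols (x for tensor entries, a, b, c for rows of the three factor matrices, R for the rank) are notation chosen here for illustration and are not taken from the paper; the second-layer model and the treatment of binary observations are not shown.

```latex
x_{ijk} \;\approx\; \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr},
\qquad i = 1,\dots,I,\quad j = 1,\dots,J,\quad k = 1,\dots,K
```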
Many modern recommender systems rely on matrix factorization techniques to produce personalized recommendations on the basis of the feedback that users provided on different items in the past. The feedback may take different forms, such as the rating of a movie, or the number of times a user listened to the songs of a given music band. Nonetheless, in some situations, the user can perform several actions on each item, and the feedback is multidimensional (e.g., the user of an e-commerce website can either click on a product, add the product to her cart, or buy it). In this case, one can no longer view the recommendation problem as a matrix completion, unless the problem is reduced to a series of multiple independent problems, thus losing the correlation between the different actions. The most suitable approach is then to use a tensor approach to learn all dimensions of the feedback simultaneously. In this paper, we propose a specific instance of tensor completion and we show how it can be heavily parallelized both over the dimensions (i.e., items, users, actions) and within each dimension (i.e., each item separately). We validate the proposed method both in terms of prediction accuracy and scalability to large datasets.
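The parallelism "within each dimension" can be illustrated with a minimal sketch. It assumes a CP model fitted by ridge-regularized alternating least squares on the observed entries only; this is a common choice, not necessarily the method of the paper. All names (`update_row`, `update_mode`, `objective`), the toy sizes, and the synthetic data are hypothetical. The point is that, with the other two factor matrices fixed, each row of a factor matrix (one user, one item, or one action) is the solution of its own small least-squares problem and can be computed independently.

```python
import random

def solve(A, b):
    """Solve the small dense system A x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def update_row(entries, B, C, rank, reg):
    """Ridge least-squares update of one factor row from its own observed entries."""
    G = [[reg if p == q else 0.0 for q in range(rank)] for p in range(rank)]
    h = [0.0] * rank
    for j, k, x in entries:
        z = [B[j][r] * C[k][r] for r in range(rank)]
        for p in range(rank):
            h[p] += x * z[p]
            for q in range(rank):
                G[p][q] += z[p] * z[q]
    return solve(G, h)

def update_mode(obs, mode, factors, rank, reg):
    """Recompute the rows of factors[mode] that have at least one observed entry."""
    m1, m2 = [m for m in range(3) if m != mode]
    by_row = {}
    for idx, x in obs.items():
        by_row.setdefault(idx[mode], []).append((idx[m1], idx[m2], x))
    # Every iteration is independent of the others, so this loop
    # could be replaced by a parallel map over rows.
    for i, entries in by_row.items():
        factors[mode][i] = update_row(entries, factors[m1], factors[m2], rank, reg)

def objective(obs, factors, rank, reg):
    """Squared error on observed entries plus the ridge penalty."""
    A, B, C = factors
    sse = sum((x - sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))) ** 2
              for (i, j, k), x in obs.items())
    return sse + reg * sum(v * v for F in factors for row in F for v in row)

# Toy data (assumed): 6 users x 5 items x 3 actions, rank 2, about 60% observed.
random.seed(0)
shape, rank, reg = (6, 5, 3), 2, 1e-3
truth = [[[random.gauss(0, 1) for _ in range(rank)] for _ in range(n)] for n in shape]
obs = {}
for i in range(shape[0]):
    for j in range(shape[1]):
        for k in range(shape[2]):
            if random.random() < 0.6:
                obs[(i, j, k)] = sum(truth[0][i][r] * truth[1][j][r] * truth[2][k][r]
                                     for r in range(rank))

factors = [[[random.gauss(0, 1) for _ in range(rank)] for _ in range(n)] for n in shape]
loss_before = objective(obs, factors, rank, reg)
for _ in range(20):
    for mode in range(3):  # users, then items, then actions
        update_mode(obs, mode, factors, rank, reg)
loss_after = objective(obs, factors, rank, reg)
```

Because each row update exactly minimizes its own part of the regularized objective, `loss_after` cannot exceed `loss_before`. In a larger setting, the loop in `update_mode` is the natural place to distribute work, for example with `concurrent.futures.ProcessPoolExecutor`.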