Provable Tensor Factorization with Missing Data

Neural Information Processing Systems

We study the problem of low-rank tensor factorization in the presence of missing data. We ask the following question: how many sampled entries do we need to efficiently and exactly reconstruct a tensor with a low-rank orthogonal decomposition? We propose a novel alternating minimization based method which iteratively refines estimates of the singular vectors. We show that under certain standard assumptions, our method can recover a three-mode $n\times n\times n$ dimensional rank-$r$ tensor exactly from $O(n^{3/2} r^5 \log^4 n)$ randomly sampled entries. In the process of proving this result, we solve two challenging sub-problems for tensors with missing data. First, in analyzing the initialization step, we prove a generalization of a celebrated result by Szemer\'edi et al. on the spectrum of random graphs. Next, we prove global convergence of alternating minimization with a good initialization. Simulations suggest that the dependence of the sample size on dimensionality $n$ is indeed tight.
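To make the iteration concrete, below is a minimal NumPy sketch of the per-row least-squares update that one sweep of alternating minimization performs for a rank-$r$ CP model fitted only on the observed entries. The function names and the plain CP parameterization are illustrative assumptions; the paper's actual method (its spectral initialization and the exact form of the refinement step) differs in detail.

import numpy as np

def update_factor_U(obs, U, V, W):
    """One alternating-minimization update of factor U for a rank-r CP
    model T[i,j,k] ~ sum_l U[i,l]*V[j,l]*W[k,l], fit only on observed entries.

    obs: list of (i, j, k, value) tuples of sampled tensor entries.
    """
    U_new = U.copy()
    # Group observations by their first index; each row of U is an
    # independent least-squares problem.
    rows = {}
    for i, j, k, val in obs:
        rows.setdefault(i, []).append((j, k, val))
    for i, entries in rows.items():
        # Design matrix: one row per observed entry, features are the
        # elementwise products of the corresponding V and W rows.
        A = np.array([V[j] * W[k] for j, k, _ in entries])   # (m_i, r)
        b = np.array([val for _, _, val in entries])          # (m_i,)
        U_new[i], *_ = np.linalg.lstsq(A, b, rcond=None)
    return U_new

By symmetry, the same routine updates $V$ and $W$ with the index roles permuted; alternating the three updates gives one full sweep.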


Parallel Higher Order Alternating Least Square for Tensor Recommender System

AAAI Conferences

Many modern recommender systems rely on matrix factorization techniques to produce personalized recommendations on the basis of the feedback that users provided on different items in the past. The feedback may take different forms, such as the rating of a movie, or the number of times a user listened to the songs of a given music band. Nonetheless, in some situations the user can perform several actions on each item, and the feedback is multidimensional (e.g., the user of an e-commerce website can either click on a product, add the product to her cart, or buy it). In this case, one can no longer view the recommendation problem as a matrix completion, unless the problem is reduced to a series of multiple independent problems, thus losing the correlation between the different actions. The most suitable approach is then to use a tensor formulation to learn all dimensions of the feedback simultaneously. In this paper, we propose a specific instance of tensor completion and we show how it can be heavily parallelized both over the dimensions (i.e., items, users, actions) and within each dimension (i.e., each item separately). We validate the proposed method both in terms of prediction accuracy and scalability to large datasets.
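A sketch of the parallelization idea: within an alternating-least-squares sweep, each row of a factor matrix solves its own small regularized least-squares problem using only the observations that touch that row, so rows can be dispatched to workers independently. The Python sketch below (hypothetical helper names; the paper's exact parallel scheme and regularization may differ) illustrates this with a process pool.

import numpy as np
from concurrent.futures import ProcessPoolExecutor
from functools import partial

def _solve_row(entries, V, W, lam=0.1):
    """Regularized least squares for one row of a factor matrix.
    entries: list of (j, k, value) observations sharing this row index."""
    r = V.shape[1]
    A = np.array([V[j] * W[k] for j, k, _ in entries])
    b = np.array([val for _, _, val in entries])
    # Ridge term keeps the solve well-posed when a row has few observations.
    return np.linalg.solve(A.T @ A + lam * np.eye(r), A.T @ b)

def parallel_update_U(obs_by_row, V, W, n_workers=4):
    """obs_by_row: dict mapping row index i -> list of (j, k, value).
    Rows are independent, so they can be solved in parallel."""
    idx = sorted(obs_by_row)
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        rows = list(pool.map(partial(_solve_row, V=V, W=W),
                             [obs_by_row[i] for i in idx]))
    return dict(zip(idx, rows))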


Tensor p-shrinkage nuclear norm for low-rank tensor completion

arXiv.org Machine Learning

In this paper, a new definition of the tensor p-shrinkage nuclear norm (p-TNN) is proposed based on the tensor singular value decomposition (t-SVD). In particular, it can be proved that p-TNN is a better approximation of the tensor average rank than the tensor nuclear norm when p < 1. Therefore, by employing the p-shrinkage nuclear norm, a novel low-rank tensor completion (LRTC) model is proposed to estimate a tensor from its partial observations. Statistically, an upper bound on the recovery error is provided for the LRTC model. Furthermore, an efficient algorithm, accelerated by an adaptive momentum scheme, is developed to solve the resulting nonconvex optimization problem. The algorithm is further guaranteed to enjoy a global convergence rate under a smoothness assumption. Numerical experiments conducted on both synthetic and real-world data sets verify our results and demonstrate the superiority of p-TNN over several state-of-the-art methods on LRTC problems.
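For intuition, the core building block is a p-shrinkage thresholding of singular values; in the t-SVD setting it is applied slice-wise in the Fourier domain, but the matrix version below already shows the behavior. This is a minimal sketch using one common form of the p-shrinkage operator; the paper's precise operator and algorithm differ.

import numpy as np

def p_shrink(sigma, lam, p):
    """One common form of the p-shrinkage rule (p <= 1): for p = 1 this
    reduces to ordinary soft-thresholding of the singular values."""
    # Guard against 0**(p-1); entries at zero stay at zero anyway.
    s = np.where(sigma > 0, sigma, 1.0)
    return np.maximum(sigma - lam * s ** (p - 1.0), 0.0)

def p_shrink_matrix(X, lam, p=0.5):
    """Proximal-style low-rank shrinkage of a matrix: threshold its
    singular values with the p-shrinkage rule above."""
    U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(p_shrink(sigma, lam, p)) @ Vt

For p = 1 this is soft-thresholding (the nuclear-norm proximal map); for p < 1, small singular values are shrunk more aggressively while large ones are left nearly intact, which is the sense in which p-TNN better approximates the average rank.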


A New Sampling Technique for Tensors

arXiv.org Machine Learning

In this paper we propose new techniques to sample arbitrary third-order tensors, with the objective of speeding up tensor algorithms that have recently gained popularity in machine learning. Our main contribution is a new way to select, in a biased random way, only $O(n^{1.5}/\epsilon^2)$ of the possible $n^3$ elements while still achieving each of three goals: (a) tensor sparsification: for a tensor that has to be formed from arbitrary samples, compute very few elements to get a good spectral approximation; and, for arbitrary orthogonal tensors, (b) tensor completion: recover an exactly low-rank tensor from a small number of samples via alternating least squares; or (c) tensor factorization: approximate the factors of a low-rank tensor corrupted by noise. Our sampling can be used along with existing tensor-based algorithms to speed them up, removing the computational bottleneck in these methods.
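A generic rendering of biased element-wise sampling: keep each entry independently with a non-uniform probability, and rescale every kept entry by the inverse of its probability so the sparse result is an unbiased estimator of the tensor. The bias used below (proportional to squared magnitude) is an illustrative assumption; the paper derives its own distribution and the $O(n^{1.5}/\epsilon^2)$ budget from its analysis.

import numpy as np

def biased_sparsify(T, m):
    """Sample roughly m entries of a dense 3-mode tensor T with
    probabilities biased toward large-magnitude entries, rescaling each
    kept entry by 1/p so that the sparse tensor equals T in expectation."""
    # Illustrative bias: probability proportional to squared magnitude,
    # capped at 1; the paper's actual distribution differs.
    weights = T ** 2
    probs = np.minimum(1.0, m * weights / weights.sum())
    keep = np.random.rand(*T.shape) < probs
    sparse = np.zeros_like(T)
    sparse[keep] = T[keep] / probs[keep]
    return sparse, keep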


Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality

arXiv.org Machine Learning

An important reason for the growing use of tensor methods is the effective representation of multiway data using a tensor structure. One example is the recommender system (Bi et al., 2018), which can be naturally described as a three-way user $\times$ item $\times$ context tensor in which each entry indicates a user-item interaction. Another example is the DBLP database (Zhe et al., 2016), which is organized into a three-way author $\times$ word $\times$ venue tensor in which each entry indicates the co-occurrence of the triplet. Whereas many real-world multiway datasets have continuous-valued entries, binary tensors, in which every entry is a 0/1 indicator, have recently become more common. Examples include click/no-click actions in recommender systems (Sun et al., 2017), multi-relational social networks (Nickel et al., 2011), and brain structural connectivity networks (Wang et al., 2017a).
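The modeling idea behind probabilistic decomposition of such binary tensors can be sketched as follows: place a low-rank CP structure on a latent parameter tensor, link it to the 0/1 entries through, e.g., a logistic function, and maximize the Bernoulli likelihood. The gradient-ascent sketch below is a minimal illustration under these assumptions, not the paper's estimator.

import numpy as np

def fit_binary_cp(Y, r, steps=500, lr=0.05, seed=0):
    """Maximum-likelihood CP decomposition of a binary tensor Y (0/1):
    Y[i,j,k] ~ Bernoulli(sigmoid(Theta[i,j,k])),
    Theta[i,j,k] = sum_l U[i,l] * V[j,l] * W[k,l]."""
    rng = np.random.default_rng(seed)
    n1, n2, n3 = Y.shape
    U, V, W = (0.1 * rng.standard_normal((n, r)) for n in (n1, n2, n3))
    for _ in range(steps):
        theta = np.einsum('il,jl,kl->ijk', U, V, W)
        resid = Y - 1.0 / (1.0 + np.exp(-theta))   # Y - sigmoid(theta)
        # Gradient of the Bernoulli log-likelihood w.r.t. each factor.
        U += lr * np.einsum('ijk,jl,kl->il', resid, V, W)
        V += lr * np.einsum('ijk,il,kl->jl', resid, U, W)
        W += lr * np.einsum('ijk,il,jl->kl', resid, U, V)
    return U, V, W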