Estimating the joint probability mass function (PMF) of a set of random variables lies at the heart of statistical learning and signal processing. Without structural assumptions, such as modeling the variables as a Markov chain, tree, or other graphical model, joint PMF estimation is often considered mission impossible - the number of unknowns grows exponentially with the number of variables. But who gives us the structural model? Is there a generic, 'non-parametric' way to control joint PMF complexity without relying on a priori structural assumptions regarding the underlying probability model? Is it possible to discover the operational structure without biasing the analysis up front? What if we only observe random subsets of the variables, can we still reliably estimate the joint PMF of all? This paper shows, perhaps surprisingly, that if the joint PMF of any three variables can be estimated, then the joint PMF of all the variables can be provably recovered under relatively mild conditions. The result is reminiscent of Kolmogorov's extension theorem - consistent specification of lower-order distributions induces a unique probability measure for the entire process. The difference is that for processes of limited complexity (rank of the high-order PMF) it is possible to obtain complete characterization from only third-order distributions. In fact not all third order PMFs are needed; and under more stringent conditions even second-order will do. Exploiting multilinear (tensor) algebra, this paper proves that such higher-order PMF completion can be guaranteed - several pertinent identifiability results are derived. It also provides a practical and efficient algorithm to carry out the recovery task. Judiciously designed simulations and real-data experiments on movie recommendation and data classification are presented to showcase the effectiveness of the approach.
We study the problem of learning a mixture model of non-parametric product distributions. The problem of learning a mixture model is that of finding the component distributions along with the mixing weights using observed samples generated from the mixture. The problem is well-studied in the parametric setting, i.e., when the component distributions are members of a parametric family -- such as Gaussian distributions. In this work, we focus on multivariate mixtures of non-parametric product distributions and propose a two-stage approach which recovers the component distributions of the mixture under a smoothness condition. Our approach builds upon the identifiability properties of the canonical polyadic (low-rank) decomposition of tensors, in tandem with Fourier and Shannon-Nyquist sampling staples from signal processing. We demonstrate the effectiveness of the approach on synthetic and real datasets.
Function approximation from input and output data pairs constitutes a fundamental problem in supervised learning. Deep neural networks are currently the most popular method for learning to mimic the input-output relationship of a generic nonlinear system, as they have proven to be very effective in approximating complex highly nonlinear functions. In this work, we propose low-rank tensor completion as an appealing alternative for modeling and learning complex nonlinear systems. We model the interactions between the $N$ input variables and the scalar output of a system by a single N-way tensor, and setup a weighted low-rank tensor completion problem with smoothness regularization which we tackle using a block coordinate descent algorithm. We extend our method to the multi-output setting and the case of partially observed data, which cannot be readily handled by neural networks. Finally, we demonstrate the effectiveness of the approach using several regression tasks including some standard benchmarks and a challenging student grade prediction task.
To ensure interpretability of extracted sources in tensor decomposition, we introduce in this paper a dictionary-based tensor canonical polyadic decomposition which enforces one factor to belong exactly to a known dictionary. A new formulation of sparse coding is proposed which enables high dimensional tensors dictionary-based canonical polyadic decomposition. The benefits of using a dictionary in tensor decomposition models are explored both in terms of parameter identifiability and estimation accuracy. Performances of the proposed algorithms are evaluated on the decomposition of simulated data and the unmixing of hyperspectral images.
Tensor completion is a problem of filling the missing or unobserved entries of partially observed tensors. Due to the multidimensional character of tensors in describing complex datasets, tensor completion algorithms and their applications have received wide attention and achievement in data mining, computer vision, signal processing, and neuroscience, etc. In this survey, we provide a modern overview of recent advances in tensor completion algorithms from the perspective of big data analytics characterized by diverse variety, large volume, and high velocity. Towards a better comprehension and comparison of vast existing advances, we summarize and categorize them into four groups including general tensor completion algorithms, tensor completion with auxiliary information (variety), scalable tensor completion algorithms (volume) and dynamic tensor completion algorithms (velocity). Besides, we introduce their applications on real-world data-driven problems and present an open-source package covering several widely used tensor decomposition and completion algorithms. Our goal is to summarize these popular methods and introduce them to researchers for promoting the research process in this field and give an available repository for practitioners. In the end, we also discuss some challenges and promising research directions in this community for future explorations.