Stochastic Subsampling for Factorizing Huge Matrices
Arthur Mensch, Julien Mairal, Bertrand Thirion, Gael Varoquaux
Matrix factorization is a flexible approach to uncover latent factors in low-rank or sparse models. With sparse factors, it is used in dictionary learning, and has proven very effective for denoising and visual feature encoding in signal processing and computer vision [see, e.g., 1]. When the data admit a low-rank structure, matrix factorization has proven very powerful for various tasks such as matrix completion [2, 3], word embedding [4, 5], or network models [6]. It is flexible enough to accommodate a large set of constraints and regularizations, and has gained significant attention in scientific domains where interpretability is a key aspect, such as genetics [7] and neuroscience [8]. In this paper, our goal is to adapt matrix-factorization techniques to huge-dimensional datasets, i.e., datasets with both a large number of columns n and a large number of rows p. Specifically, our work is motivated by the rapid increase in sensor resolution, as in hyperspectral imaging or fMRI, and by the challenge that the resulting high-dimensional signals pose to current algorithms.
Oct-30-2017