Performance Gaps in Multi-view Clustering under the Nested Matrix-Tensor Model
Lebeau, Hugo, Seddik, Mohamed El Amine, Goulart, José Henrique de Morais
arXiv.org Artificial Intelligence
We study the estimation of a planted signal hidden in a recently introduced nested matrix-tensor model, which is an extension of the classical spiked rank-one tensor model, motivated by multi-view clustering. Prior work has theoretically examined the performance of a tensor-based approach, which relies on finding a best rank-one approximation, a problem known to be computationally hard. A tractable alternative is to compute the best rank-one (matrix) approximation of an unfolding of the observed tensor data, but its performance was hitherto unknown. We quantify here the performance gap between these two approaches, in particular by deriving the precise algorithmic threshold of the unfolding approach and demonstrating that it exhibits a BBP-type transition behavior (Baik et al., 2005). This work is therefore in line with recent contributions which deepen our understanding of why tensor-based methods surpass matrix-based methods in handling structured tensor data.
In the age of artificial intelligence, handling vast amounts of data has become a fundamental aspect of machine learning tasks. Datasets are often high-dimensional and composed of multiple modes, such as various modalities, sensors, sources, types, or domains, naturally lending themselves to be represented as tensors. Tensors offer a richer structure compared to traditional one-dimensional vectors and two-dimensional matrices, making them increasingly relevant in various applications, including statistical learning and data analysis (Landsberg, 2012; Sun et al., 2014). Yet, in the existing literature, there is a notable scarcity of theoretical studies that specifically address the performance gaps between tensor-based methods and traditional (matrix) spectral methods in the context of high-dimensional data analysis.
While tensor methods have shown promise in various applications, including multi-view clustering, co-clustering, community detection, and latent variable modeling (Wu et al., 2019; Anandkumar et al., 2014; Papalexakis et al., 2012; Wang et al., 2023), little attention has been devoted to rigorously quantifying the advantages and drawbacks of leveraging the hidden low-rank tensor structure.
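To make the two approaches contrasted in the abstract concrete, here is a minimal NumPy sketch. It assumes the classical spiked rank-one tensor model (a rank-one signal plus Gaussian noise) rather than the nested matrix-tensor model itself, whose definition is not reproduced here. The function names, the noise scaling, and the value of `beta` are illustrative assumptions, not taken from the paper. Because the best rank-one tensor approximation is computationally hard in general, the tensor-based route is illustrated with alternating power iteration, a common heuristic that carries no global optimality guarantee.

```python
import numpy as np


def unit(v):
    """Scale a vector to unit Euclidean norm."""
    return v / np.linalg.norm(v)


def spiked_tensor(dims, beta, rng):
    """Rank-one signal beta * x (x) y (x) z plus Gaussian noise.

    Assumption: noise entries are N(0, 1 / sum(dims)); this scaling is an
    illustrative choice, not the paper's exact normalisation.
    """
    x, y, z = (unit(rng.standard_normal(n)) for n in dims)
    signal = beta * np.einsum("i,j,k->ijk", x, y, z)
    noise = rng.standard_normal(dims) / np.sqrt(sum(dims))
    return signal + noise, (x, y, z)


def unfold(T, mode):
    """Mode-`mode` unfolding: a matrix of shape (dims[mode], product of the rest)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)


def unfolding_estimate(T, mode):
    """Matrix approach: dominant left singular vector of the unfolding."""
    U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
    return U[:, 0]


def power_iteration(T, init, n_iter=100):
    """Tensor approach (heuristic): alternating rank-one power iteration."""
    x, y, z = init
    for _ in range(n_iter):
        x = unit(np.einsum("ijk,j,k->i", T, y, z))
        y = unit(np.einsum("ijk,i,k->j", T, x, z))
        z = unit(np.einsum("ijk,i,j->k", T, x, y))
    return x, y, z


rng = np.random.default_rng(0)
T, (x, y, z) = spiked_tensor((30, 40, 50), beta=10.0, rng=rng)

# Matrix-based estimates, one unfolding per mode.
init = tuple(unfolding_estimate(T, mode) for mode in range(3))
# Tensor-based refinement, started from the unfolding estimates.
x_hat, y_hat, z_hat = power_iteration(T, init)

# Alignments are reported in absolute value because singular vectors
# are only defined up to sign.
print("unfolding alignment with x:", abs(init[0] @ x))
print("power-iteration alignment with x:", abs(x_hat @ x))
```

With a strong signal, as in this demo, both estimators should align well with the planted vector. Lowering `beta` is one way to explore empirically the regime where the two approaches start to differ.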
Feb-16-2024