High-Dimensional Partial Least Squares: Spectral Analysis and Fundamental Limitations
Victor Léger, Florent Chatelain
Partial Least Squares (PLS) is a widely used method for data integration, designed to extract latent components shared across paired high-dimensional datasets. Despite decades of practical success, a precise theoretical understanding of its behavior in high-dimensional regimes remains limited. In this paper, we study a data integration model in which two high-dimensional data matrices share a low-rank common latent structure while also containing individual-specific components. We analyze the singular vectors of the associated cross-covariance matrix using tools from random matrix theory and derive asymptotic characterizations of the alignment between estimated and true latent directions. These results provide a quantitative explanation of the reconstruction performance of the PLS variant based on Singular Value Decomposition (PLS-SVD) and identify regimes where the method exhibits counter-intuitive or limiting behavior. Building on this analysis, we compare PLS-SVD with principal component analysis applied separately to each dataset and show its asymptotic superiority in detecting the common latent subspace. Overall, our results offer a comprehensive theoretical understanding of high-dimensional PLS-SVD, clarifying both its advantages and fundamental limitations.
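As a rough illustration of the quantities the abstract refers to, the following is a minimal sketch of PLS-SVD on simulated data: it forms the sample cross-covariance matrix, takes its leading singular vectors, and measures their squared alignment with the true shared directions. The rank-one generative model, the function names (`simulate_pair`, `pls_svd`, `alignment`), and the `strength` parameter are assumptions made for this example, not the paper's exact model; in particular, the individual-specific components are omitted for brevity.

```python
import numpy as np


def simulate_pair(n, p, q, strength, rng):
    """Hypothetical rank-one shared-latent model (an assumption, not the paper's).

    X = strength * t u^T + noise,  Y = strength * t v^T + noise,
    where t holds the shared latent scores and u, v are unit directions.
    """
    u = rng.standard_normal(p)
    u /= np.linalg.norm(u)
    v = rng.standard_normal(q)
    v /= np.linalg.norm(v)
    t = rng.standard_normal(n)  # latent scores shared by both datasets
    X = strength * np.outer(t, u) + rng.standard_normal((n, p))
    Y = strength * np.outer(t, v) + rng.standard_normal((n, q))
    return X, Y, u, v


def pls_svd(X, Y, k=1):
    """Top-k singular triplets of the sample cross-covariance X^T Y / n."""
    Xc = X - X.mean(axis=0)  # center each column
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / X.shape[0]
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k].T


def alignment(estimated, true):
    """Squared cosine between two unit vectors (sign-invariant)."""
    return float((estimated @ true) ** 2)


rng = np.random.default_rng(0)
X, Y, u, v = simulate_pair(n=2000, p=50, q=40, strength=3.0, rng=rng)
U, s, V = pls_svd(X, Y, k=1)
print("alignment with u:", alignment(U[:, 0], u))
print("alignment with v:", alignment(V[:, 0], v))
```

Repeating the experiment while shrinking `strength` or growing `p` and `q` relative to `n` gives an empirical feel for how the alignment degrades in high-dimensional regimes, which is the behavior the paper characterizes asymptotically.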
Dec-18-2025