Missing-Data-Induced Phase Transitions in Spectral PLS for Multimodal Learning
Anders Gjølbye, Ida Kargaard, Emma Kargaard, Lars Kai Hansen
Partial Least Squares (PLS) learns shared structure from paired data via the top singular vectors of the empirical cross-covariance (PLS-SVD), but multimodal datasets often have missing entries in both views. We study PLS-SVD under independent entry-wise missing-completely-at-random masking in a proportional high-dimensional spiked model. After appropriate normalization, the masked cross-covariance behaves like a spiked rectangular random matrix whose effective signal strength is attenuated by $\sqrt{\rho}$, where $\rho$ is the joint entry retention probability. As a result, PLS-SVD exhibits a sharp BBP-type phase transition: below a critical signal-to-noise threshold the leading singular vectors are asymptotically uninformative, while above it they achieve nontrivial alignment with the latent shared directions, with closed-form asymptotic overlap formulas. Simulations and semi-synthetic multimodal experiments corroborate the predicted phase diagram and recovery curves across aspect ratios, signal strengths, and missingness levels.
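The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a rank-one spiked model with a shared Gaussian latent factor, unit-variance Gaussian noise, independent Bernoulli masks with per-view retention probabilities `rho_x` and `rho_y` (so the joint retention probability would be their product), and zero-filling of missing entries. The function name `masked_pls_svd_overlap` and all parameter names are hypothetical. It returns the empirical overlaps between the leading singular vectors of the masked cross-covariance and the planted directions, and it does not implement the paper's normalization or its closed-form overlap formulas.

```python
import numpy as np


def masked_pls_svd_overlap(n=2000, p=200, q=150, theta=4.0,
                           rho_x=0.7, rho_y=0.7, seed=0):
    """Empirical overlaps of masked PLS-SVD with planted directions.

    Assumed model (illustrative, not taken from the paper):
        X = sqrt(theta) * z u^T + noise,   Y = sqrt(theta) * z v^T + noise,
    with entries kept independently with probability rho_x / rho_y (MCAR)
    and missing entries filled with zeros.
    """
    rng = np.random.default_rng(seed)

    # Planted unit-norm shared directions in each view.
    u = rng.standard_normal(p)
    u /= np.linalg.norm(u)
    v = rng.standard_normal(q)
    v /= np.linalg.norm(v)

    # Shared latent factor and the two noisy views.
    z = rng.standard_normal(n)
    X = np.sqrt(theta) * np.outer(z, u) + rng.standard_normal((n, p))
    Y = np.sqrt(theta) * np.outer(z, v) + rng.standard_normal((n, q))

    # Independent entry-wise MCAR masks (True = observed).
    mask_x = rng.random((n, p)) < rho_x
    mask_y = rng.random((n, q)) < rho_y

    # Zero-filled empirical cross-covariance (p x q).
    C = (X * mask_x).T @ (Y * mask_y) / n

    # PLS-SVD: leading left/right singular vectors of C.
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    overlap_u = abs(U[:, 0] @ u)
    overlap_v = abs(Vt[0] @ v)
    return overlap_u, overlap_v


if __name__ == "__main__":
    for theta in (0.0, 0.5, 1.0, 2.0, 4.0):
        ou, ov = masked_pls_svd_overlap(theta=theta)
        print(f"theta={theta:4.1f}  overlap_u={ou:.3f}  overlap_v={ov:.3f}")
```

Sweeping `theta`, `rho_x`, and `rho_y` in this sketch is one way to visualize qualitatively how the overlaps move from near zero to nontrivial values as the signal strength grows or missingness decreases; the exact threshold location depends on the normalization and model details given in the paper.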