Principal Component Analysis When n < p: Challenges and Solutions

Weeraratne, Nuwan, Hunt, Lyn, Kurz, Jason

Mar-21-2025–arXiv.org Machine Learning

Principal component analysis (PCA) is a key technique for reducing the complexity of high-dimensional data while preserving its underlying structure, which helps keep models stable and interpretable. It does so by transforming the original variables into a new set of uncorrelated variables (principal components) based on the covariance structure of the original variables. However, because the traditional maximum likelihood covariance estimator does not converge accurately to the true covariance matrix, standard principal component analysis performs poorly as a dimensionality reduction technique in high-dimensional scenarios where $n < p$ (fewer observations $n$ than variables $p$).
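As a rough illustration of the mechanism described in the abstract, the following sketch computes principal components from the eigendecomposition of the maximum likelihood covariance estimate and applies it to simulated data with fewer observations than variables. It is not the authors' method. The function name `pca_from_covariance`, the sample sizes, and the standard normal data (true covariance equal to the identity) are all assumptions chosen for illustration.

```python
import numpy as np


def pca_from_covariance(X):
    """Hypothetical helper: PCA via eigendecomposition of the ML covariance estimate."""
    n, _ = X.shape
    Xc = X - X.mean(axis=0)            # center each variable
    S = Xc.T @ Xc / n                  # ML estimator divides by n, not n - 1
    eigvals, eigvecs = np.linalg.eigh(S)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs              # principal component scores
    return eigvals, eigvecs, scores


# Assumed toy setting: n = 10 observations, p = 50 variables, identity covariance.
rng = np.random.default_rng(0)
n, p = 10, 50
X = rng.standard_normal((n, p))

eigvals, eigvecs, scores = pca_from_covariance(X)

# After centering, the estimated covariance has rank at most n - 1,
# so at least p - (n - 1) of its eigenvalues are zero up to rounding.
rank = np.linalg.matrix_rank(np.cov(X, rowvar=False, bias=True))
n_nonzero = int(np.sum(eigvals > 1e-10))

print("rank of estimated covariance:", rank)
print("eigenvalues above 1e-10:", n_nonzero)
print("largest estimated eigenvalue:", eigvals[0])
```

Every eigenvalue of the true covariance in this toy setting equals 1, yet the estimate concentrates all of its variance in at most n - 1 directions and assigns zero variance to the rest. That distortion of the eigenvalue spectrum is one way to see why components derived from this estimator can be unreliable when $n < p$.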