Principal Component Analysis


Principal Component Analysis When n < p: Challenges and Solutions

Weeraratne, Nuwan, Hunt, Lyn, Kurz, Jason

arXiv.org Machine Learning

Principal Component Analysis is a key technique for reducing the complexity of high-dimensional data while preserving its fundamental structure, ensuring models remain stable and interpretable. This is achieved by transforming the original variables into a new set of uncorrelated variables (principal components) based on the covariance structure of the original variables. However, since the traditional maximum likelihood covariance estimator does not accurately converge to the true covariance matrix, standard principal component analysis performs poorly as a dimensionality reduction technique in high-dimensional scenarios where $n < p$.
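To see the failure mode concretely, here is a minimal NumPy sketch (not the paper's proposed remedy) of covariance-based PCA when $n < p$: the sample covariance has rank at most $n - 1$, so most of its eigenvalues are exactly zero and the estimated components become unreliable.

```python
# Minimal sketch of standard (covariance-based) PCA in the n < p regime.
# This only illustrates the problem described above, not the paper's solution.
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 100                          # fewer observations than variables
X = rng.normal(size=(n, p))

Xc = X - X.mean(axis=0)                 # center the variables
S = (Xc.T @ Xc) / (n - 1)               # p x p sample covariance

eigvals = np.linalg.eigvalsh(S)[::-1]   # eigenvalues, largest first

# With n < p, S is rank-deficient: at most n - 1 nonzero eigenvalues,
# so the covariance estimate (and the resulting components) is unreliable.
print("rank of S:", np.linalg.matrix_rank(S), "out of", p)
print("numerically nonzero eigenvalues:", int(np.sum(eigvals > 1e-10)), "of", p)
```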


Nonlinear Principal Component Analysis with Random Bernoulli Features for Process Monitoring

Chen, Ke, Jiang, Dandan

arXiv.org Machine Learning

Industrial processes generate substantial amounts of data with highly complex structures, leading to the development of numerous nonlinear statistical methods. However, most of these methods rely on computations involving large-scale dense kernel matrices. This dependence poses significant challenges in meeting the high computational demands and real-time responsiveness required by online monitoring systems. To alleviate the computational burden of dense large-scale matrix multiplication, we incorporate the bootstrap sampling concept into random feature mapping and propose a novel random Bernoulli principal component analysis method to efficiently capture nonlinear patterns in the process. We derive a convergence bound for the kernel matrix approximation constructed using random Bernoulli features, ensuring theoretical robustness. Subsequently, we design four fast process monitoring methods based on random Bernoulli principal component analysis to extend its nonlinear capabilities for handling diverse fault scenarios. Finally, numerical experiments and real-world data analyses are conducted to evaluate the performance of the proposed methods. Results demonstrate that the proposed methods offer excellent scalability and reduced computational complexity, achieving substantial cost savings with minimal performance loss compared to traditional kernel-based approaches.
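As a rough illustration of the random-feature idea (the paper's exact Bernoulli construction and its bootstrap-sampling component are not reproduced here), the sketch below maps the data through an explicit low-dimensional random feature map with ±1 (Rademacher-type) weights and performs PCA on the small feature covariance instead of a dense $n \times n$ kernel matrix.

```python
# Illustrative random-feature acceleration of nonlinear (kernel-style) PCA.
# The specific random Bernoulli feature map of the paper is not reproduced;
# this uses a generic ±1 random projection with a cosine nonlinearity.
import numpy as np

rng = np.random.default_rng(0)
n, p, D = 500, 10, 64                        # samples, variables, random features

X = rng.normal(size=(n, p))
W = rng.choice([-1.0, 1.0], size=(p, D))     # ±1 (Bernoulli/Rademacher-type) weights, assumed form
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)     # explicit feature map, n x D

# PCA in the random feature space: work with the small D x D covariance
# instead of the dense n x n kernel matrix K ~ Z Z^T.
Zc = Z - Z.mean(axis=0)
C = (Zc.T @ Zc) / (n - 1)
eigvals, eigvecs = np.linalg.eigh(C)
scores = Zc @ eigvecs[:, ::-1][:, :5]        # top-5 nonlinear principal scores
print(scores.shape)
```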


Robust Multilinear Principal Component Analysis

Hirari, Mehdi, Centofanti, Fabio, Hubert, Mia, Van Aelst, Stefan

arXiv.org Machine Learning

Multilinear Principal Component Analysis (MPCA) is an important tool for analyzing tensor data. It performs dimension reduction similar to PCA for multivariate data. However, standard MPCA is sensitive to outliers. It is highly influenced by observations deviating from the bulk of the data, called casewise outliers, as well as by individual outlying cells in the tensors, so-called cellwise outliers. This latter type of outlier is highly likely to occur in tensor data, as tensors typically consist of many cells. This paper introduces a novel robust MPCA method that can handle both types of outliers simultaneously, and can cope with missing values as well. This method uses a single loss function to reduce the influence of both casewise and cellwise outliers. The solution that minimizes this loss function is computed using an iteratively reweighted least squares algorithm with a robust initialization. Graphical diagnostic tools are also proposed to identify the different types of outliers that have been found by the new robust MPCA method. The performance of the method and associated graphical displays is assessed through simulations and illustrated on two real datasets.
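The sketch below illustrates the iteratively reweighted least squares idea in a deliberately simplified, matrix (rather than tensor) setting; the paper's actual loss function, multilinear decomposition, and robust initialization are not reproduced, and the weighting scheme used here is a generic Huber-type choice.

```python
# Simplified IRLS-style robust low-rank fitting on a plain data matrix,
# shown only to illustrate the reweighting idea; not the paper's tensor method.
import numpy as np

def irls_robust_lowrank(X, rank=2, n_iter=20, c=3.0):
    """Iteratively downweight cells with large residuals while fitting a low-rank model."""
    W = np.ones_like(X)                                   # cellwise weights
    for _ in range(n_iter):
        # Crude weighted low-rank step: SVD of the cellwise-reweighted data.
        U, s, Vt = np.linalg.svd(W * X, full_matrices=False)
        L = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank]
        R = X - L                                         # cellwise residuals
        scale = 1.4826 * np.median(np.abs(R)) + 1e-12     # robust residual scale (MAD-like)
        A = np.abs(R) / scale
        W = np.where(A <= c, 1.0, c / (A + 1e-12))        # Huber-type weights
    return L, W

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 8)) @ rng.normal(size=(8, 8))
X[0, 0] += 50.0                                           # inject a cellwise outlier
L, W = irls_robust_lowrank(X)
print("weight assigned to the outlying cell:", round(float(W[0, 0]), 3))
```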


The Distributionally Robust Optimization Model of Sparse Principal Component Analysis

Wang, Lei, Liu, Xin, Chen, Xiaojun

arXiv.org Machine Learning

We consider sparse principal component analysis (PCA) under a stochastic setting where the underlying probability distribution of the random parameter is uncertain. This problem is formulated as a distributionally robust optimization (DRO) model based on a constructive approach to capturing uncertainty in the covariance matrix, which constitutes a nonsmooth constrained min-max optimization problem. We further prove that the inner maximization problem admits a closed-form solution, reformulating the original DRO model into an equivalent minimization problem on the Stiefel manifold. This transformation leads to a Riemannian optimization problem with intricate nonsmooth terms, a challenging formulation beyond the reach of existing algorithms. To address this issue, we devise an efficient smoothing manifold proximal gradient algorithm. We prove the Riemannian gradient consistency and global convergence of our algorithm to a stationary point of the nonsmooth minimization problem. Moreover, we establish the iteration complexity of our algorithm. Finally, numerical experiments are conducted to validate the effectiveness and scalability of our algorithm, as well as to highlight the necessity and rationality of adopting the DRO model for sparse PCA.
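For orientation, a schematic and purely illustrative way to write such a DRO formulation of sparse PCA is shown below; the paper's specific uncertainty set and sparsity-inducing term are not restated here.

```latex
% Schematic min-max form of a DRO sparse PCA model (illustrative only;
% the paper's uncertainty set and penalty are not reproduced).
\begin{equation*}
  \min_{X \in \mathrm{St}(p,r)} \;
  \max_{\Sigma \in \mathcal{U}(\widehat{\Sigma})}
  \; -\operatorname{tr}\!\left(X^{\top} \Sigma X\right)
  \;+\; \lambda \lVert X \rVert_{1},
\end{equation*}
% where St(p,r) = {X in R^{p x r} : X^T X = I_r} is the Stiefel manifold,
% U(\widehat{\Sigma}) is an uncertainty set around a nominal covariance estimate,
% and the l1 term promotes sparse loadings.
```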


Provable Model-Parallel Distributed Principal Component Analysis with Parallel Deflation

Liao, Fangshuo, Su, Wenyi, Kyrillidis, Anastasios

arXiv.org Artificial Intelligence

We study a distributed Principal Component Analysis (PCA) framework where each worker targets a distinct eigenvector and refines its solution by updating from intermediate solutions provided by peers deemed "superior". Drawing intuition from the deflation method in centralized eigenvalue problems, our approach breaks the sequential dependency in the deflation steps and allows asynchronous updates of workers, while incurring only a small communication cost. To our knowledge, a gap in the literature -- the theoretical underpinning of such distributed, dynamic interactions among workers -- has remained unaddressed. This paper offers a theoretical analysis explaining why, how, and when these intermediate, hierarchical updates lead to practical and provable convergence in distributed environments. Although this is primarily a theoretical work, our prototype implementation demonstrates that such a distributed PCA algorithm converges effectively and in a scalable way: through experiments, our proposed framework offers performance comparable to EigenGame-$\mu$, the state-of-the-art model-parallel PCA solver.
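The snippet below sketches the classical, centralized deflation procedure that motivates the parallel scheme: each eigenvector is obtained by power iteration after the previously recovered components have been subtracted. The paper's distributed, asynchronous protocol is not reproduced here.

```python
# Classical sequential deflation for the top-k eigenvectors of a covariance matrix.
# Shown only as background for the parallel-deflation idea discussed above.
import numpy as np

def power_iteration(A, n_iter=200, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    v = rng.normal(size=A.shape[0])
    for _ in range(n_iter):
        v = A @ v
        v /= np.linalg.norm(v)
    return v

def deflation_pca(A, k):
    """Recover the top-k eigenvectors one at a time, deflating after each."""
    vectors = []
    B = A.copy()
    for _ in range(k):
        v = power_iteration(B)
        lam = v @ B @ v                      # Rayleigh quotient of the recovered direction
        B = B - lam * np.outer(v, v)         # remove that component before the next pass
        vectors.append(v)
    return np.column_stack(vectors)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
A = np.cov(X, rowvar=False)
V = deflation_pca(A, k=3)
print(np.round(V.T @ V, 3))                  # approximately the identity matrix
```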


Rethinking Diverse Human Preference Learning through Principal Component Analysis

Luo, Feng, Yang, Rui, Sun, Hao, Deng, Chunyuan, Yao, Jiarui, Shen, Jingyan, Zhang, Huan, Chen, Hanjie

arXiv.org Artificial Intelligence

Understanding human preferences is crucial for improving foundation models and building personalized AI systems. However, preferences are inherently diverse and complex, making it difficult for traditional reward models to capture their full range. While fine-grained preference data can help, collecting it is expensive and hard to scale. In this paper, we introduce Decomposed Reward Models (DRMs), a novel approach that extracts diverse human preferences from binary comparisons without requiring fine-grained annotations. Our key insight is to represent human preferences as vectors and analyze them using Principal Component Analysis (PCA). By constructing a dataset of embedding differences between preferred and rejected responses, DRMs identify orthogonal basis vectors that capture distinct aspects of preference. These decomposed rewards can be flexibly combined to align with different user needs, offering an interpretable and scalable alternative to traditional reward models. We demonstrate that DRMs effectively extract meaningful preference dimensions (e.g., helpfulness, safety, humor) and adapt to new users without additional training. Our results highlight DRMs as a powerful framework for personalized and interpretable LLM alignment.
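A minimal sketch of the stated recipe follows, with random placeholder embeddings standing in for a real embedding model and preference dataset; it only illustrates running PCA on chosen-minus-rejected embedding differences, not the full DRM pipeline.

```python
# Illustrative sketch of the DRM idea: PCA over the differences between
# preferred and rejected response embeddings yields orthogonal preference
# directions. Embeddings here are random placeholders, not real model outputs.
import numpy as np

rng = np.random.default_rng(0)
n_pairs, dim = 1000, 128

emb_chosen = rng.normal(size=(n_pairs, dim))     # stand-in embeddings of preferred responses
emb_rejected = rng.normal(size=(n_pairs, dim))   # stand-in embeddings of rejected responses

D = emb_chosen - emb_rejected                    # one difference vector per binary comparison
D = D - D.mean(axis=0)

# Principal directions of the difference vectors act as decomposed reward heads.
U, s, Vt = np.linalg.svd(D, full_matrices=False)
reward_directions = Vt                           # rows are orthonormal preference directions

def decomposed_rewards(embedding_diff, k=5):
    """Score a (chosen - rejected) embedding difference along the top-k directions."""
    return embedding_diff @ reward_directions[:k].T

print(decomposed_rewards(D[0]).shape)
```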


Cone-Constrained Principal Component Analysis

Yash Deshpande, Andrea Montanari, Emile Richard

Neural Information Processing Systems

Estimating a vector from noisy quadratic observations is a task that arises naturally in many contexts, from dimensionality reduction to synchronization and phase retrieval problems. It is often the case that additional information is available about the unknown vector (for instance, the sparsity, sign, or magnitude of its entries). Many authors have proposed non-convex quadratic optimization problems that aim to exploit this information optimally. However, solving these problems is typically NP-hard.
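Schematically, and under the usual rank-one-plus-noise reading of "noisy quadratic observations" (an assumption for illustration, not a restatement of the paper's precise model), such cone-constrained estimators take the following form:

```latex
% Schematic cone-constrained quadratic estimator (illustrative only).
\begin{equation*}
  \widehat{v} \;=\; \arg\max_{x \in \mathcal{C},\; \lVert x \rVert_2 = 1} \; x^{\top} Y x,
  \qquad Y = \lambda\, v v^{\top} + Z,
\end{equation*}
% where C encodes the side information about v (e.g. a sign or sparsity cone)
% and Z denotes observation noise.
```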


Improved Distributed Principal Component Analysis

Yingyu Liang, Maria-Florina F. Balcan, Vandana Kanchanapally, David Woodruff

Neural Information Processing Systems

We study the distributed computing setting in which there are multiple servers, each holding a set of points, who wish to compute functions on the union of their point sets. A key task in this setting is Principal Component Analysis (PCA), in which the servers would like to compute a low dimensional subspace capturing as much of the variance of the union of their point sets as possible. Given a procedure for approximate PCA, one can use it to approximately solve problems such as k-means clustering and low rank approximation. The essential properties of an approximate distributed PCA algorithm are its communication cost and computational efficiency for a given desired accuracy in downstream applications. We give new algorithms and analyses for distributed PCA which lead to improved communication and computational costs for k-means clustering and related problems. Our empirical study on real world data shows a speedup of orders of magnitude, preserving communication with only a negligible degradation in solution quality. Some of these techniques we develop, such as a general transformation from a constant success probability subspace embedding to a high success probability subspace embedding with a dimension and sparsity independent of the success probability, may be of independent interest.
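To make the setting concrete, the following sketch shows a standard "local sketch, then aggregate" pattern for distributed approximate PCA; it is not the paper's improved algorithm and omits the subspace-embedding machinery and the downstream k-means step.

```python
# Generic distributed approximate PCA pattern: each server sends a small
# SVD-based summary of its local points, and a coordinator combines them.
# Shown only to make the setting concrete; not the paper's algorithm.
import numpy as np

def local_sketch(X_local, t):
    """Each server keeps its top-t right singular directions, scaled by singular values."""
    U, s, Vt = np.linalg.svd(X_local, full_matrices=False)
    return np.diag(s[:t]) @ Vt[:t]           # t x d summary sent to the coordinator

rng = np.random.default_rng(0)
d, t, k = 50, 10, 5
servers = [rng.normal(size=(1000, d)) for _ in range(4)]      # each server's point set

summaries = [local_sketch(X, t) for X in servers]             # communication: 4 * t * d numbers
stacked = np.vstack(summaries)

# Coordinator computes an approximate global top-k subspace from the stacked summaries.
_, _, Vt = np.linalg.svd(stacked, full_matrices=False)
approx_subspace = Vt[:k]
print(approx_subspace.shape)
```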


Quasicyclic Principal Component Analysis

Rumsey, Susanna E., Draper, Stark C., Kschischang, Frank R.

arXiv.org Machine Learning

We present quasicyclic principal component analysis (QPCA), a generalization of principal component analysis (PCA), that determines an optimized basis for a dataset in terms of families of shift-orthogonal principal vectors. This is of particular interest when analyzing cyclostationary data, whose cyclic structure is not exploited by the standard PCA algorithm. We first formulate QPCA as an optimization problem, which we show may be decomposed into a series of PCA problems in the frequency domain. We then formalize our solution as an explicit algorithm and analyze its computational complexity. Finally, we provide some examples of applications of QPCA to cyclostationary signal processing data, including an investigation of carrier pulse recovery, a presentation of methods for estimating an unknown oversampling rate, and a discussion of an appropriate approach for pre-processing data with a non-integer oversampling rate in order to better apply the QPCA algorithm.
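The sketch below gives one assumed reading of the frequency-domain decomposition mentioned above: reshape each observation into sub-channels, take a DFT along the cyclic axis, and solve an ordinary PCA independently in each frequency bin. It is not the paper's exact QPCA algorithm, and the shift-orthogonality constraints are not enforced here.

```python
# Rough, assumed illustration of "a series of PCA problems in the frequency domain".
# The exact QPCA formulation (shift-orthogonal vector families) is not reproduced.
import numpy as np

rng = np.random.default_rng(0)
n_obs, L, M = 300, 4, 32                     # observations, sub-channels, cyclic length

X = rng.normal(size=(n_obs, L, M))           # each observation viewed as an L x M array
Xf = np.fft.fft(X, axis=2)                   # DFT along the cyclic axis

top_vectors = []
for m in range(M):                           # an independent PCA problem per frequency bin
    Z = Xf[:, :, m]                          # n_obs x L complex data at bin m
    C = (Z.conj().T @ Z) / n_obs             # L x L Hermitian second-moment matrix
    eigvals, eigvecs = np.linalg.eigh(C)
    top_vectors.append(eigvecs[:, -1])       # dominant direction at this bin

print(len(top_vectors), top_vectors[0].shape)
```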


Review for NeurIPS paper: Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing

Neural Information Processing Systems

Summary and Contributions: This paper studies the problem of robust PCA for a sub-Gaussian distribution. Specifically, one is given samples $X_1, X_2, \ldots, X_n$ from a sub-Gaussian distribution such that an $\epsilon$-fraction of the samples has been arbitrarily corrupted (modified by an adversary), and the goal is to approximately recover the top singular vector of the covariance matrix $\Sigma$. Here, approximate recovery means finding a vector $u$ such that $u^{\top} \Sigma u \geq (1 - \gamma)\,\|\Sigma\|_{\mathrm{op}}$, where $\|\Sigma\|_{\mathrm{op}}$ is the operator norm of $\Sigma$. The main results of this paper are runtime- and sample-efficient algorithms for this task. Specifically, the authors show: 1) an algorithm that achieves error $\gamma = O(\epsilon \log(1/\epsilon))$ in polynomial time, specifically in $\tilde{O}(nd^2/\epsilon)$ time, using $n = \Omega\!\left(d / (\epsilon \log(1/\epsilon))^2\right)$ samples.