Principal Component Analysis
Regularized Tensor Factorizations and Higher-Order Principal Components Analysis
High-dimensional tensors or multi-way data are becoming prevalent in areas such as biomedical imaging, chemometrics, networking and bibliometrics. Traditional approaches to finding lower dimensional representations of tensor data include flattening the data and applying matrix factorizations such as principal components analysis (PCA) or employing tensor decompositions such as the CANDECOMP / PARAFAC (CP) and Tucker decompositions. The former can lose important structure in the data, while the latter Higher-Order PCA (HOPCA) methods can be problematic in high-dimensions with many irrelevant features. We introduce frameworks for sparse tensor factorizations or Sparse HOPCA based on heuristic algorithmic approaches and by solving penalized optimization problems related to the CP decomposition. Extensions of these approaches lead to methods for general regularized tensor factorizations, multi-way Functional HOPCA and generalizations of HOPCA for structured data. We illustrate the utility of our methods for dimension reduction, feature selection, and signal recovery on simulated data and multi-dimensional microarrays and functional MRIs.
Large-Scale Sparse Principal Component Analysis with Application to Text Data
Zhang, Youwei, Ghaoui, Laurent E.
Sparse PCA provides a linear combination of small number of features that maximizes variance across data. Although Sparse PCA has apparent advantages compared to PCA, such as better interpretability, it is generally thought to be computationally much more expensive. In this paper, we demonstrate the surprising fact that sparse PCA can be easier than PCA in practice, and that it can be reliably applied to very large data sets. This comes from a rigorous feature elimination pre-processing result, coupled with the favorable fact that features in real-life data typically have exponentially decreasing variances, which allows for many features to be eliminated. We introduce a fast block coordinate ascent algorithm with much better computational complexity than the existing first-order ones. We provide experimental results obtained on text corpora involving millions of documents and hundreds of thousands of features. These results illustrate how Sparse PCA can help organize a large corpus of text data in a user-interpretable way, providing an attractive alternative approach to topic models.
Demixed Principal Component Analysis
Brendel, Wieland, Romo, Ranulfo, Machens, Christian K.
In many experiments, the data points collected live in high-dimensional observation spaces, yet can be assigned a set of labels or parameters. In electrophysiological recordings, for instance, the responses of populations of neurons generally depend on mixtures of experimentally controlled parameters. The heterogeneity and diversity of these parameter dependencies can make visualization and interpretation of such data extremely difficult. Standard dimensionality reduction techniques such as principal component analysis (PCA) can provide a succinct and complete description of the data, but the description is constructed independent of the relevant task variables and is often hard to interpret. Here, we start with the assumption that a particularly informative description is one that reveals the dependency of the high-dimensional data on the individual parameters. We show how to modify the loss function of PCA so that the principal components seek to capture both the maximum amount of variance about the data, while also depending on a minimum number of parameters. We call this method demixed principal component analysis (dPCA) as the principal components here segregate the parameter dependencies. We phrase the problem as a probabilistic graphical model, and present a fast Expectation-Maximization (EM) algorithm. We demonstrate the use of this algorithm for electrophysiological data and show that it serves to demix the parameter-dependence of a neural population response.
Truncated Power Method for Sparse Eigenvalue Problems
The sparsity is controlled by the values of k and can be viewed as a design parameter. In machine learning applications, e.g., principal component analysis, this problem is motivated from the following perturbation formulation of matrix A: A ฤ E, (1.2) where A is the empirical covariance matrix, ฤ is the true covariance matrix, and E is a random perturbation due to having only a finite number of empirical samples. If we assume that the largest eigenvector x of ฤ is sparse, then a natural question is to recover x from the noisy observation A when the error E is "small". In this context, the problem (1.1) is also referred to as sparse principal component analysis (sparse PCA). 1 In general, problem (1.1) is non-convex. In fact, it is also NPhard because it can be reduced to the subset selection problem for ordinary least squares regression (Moghaddam et al., 2006), which is known to be NP hard.
Robust Principal Component Analysis with Non-Greedy โ1-Norm Maximization
Nie, Feiping (University of Texas at Arlington) | Huang, Heng (University of Texas at Arlington) | Ding, Chris (University of Texas at Arlington) | Luo, Dijun (University of Texas at Arlington) | Wang, Hua (University of Texas at Arlington)
Principal Component Analysis (PCA) is one of the most important methods to handle high-dimensional data. However, the high computa-tional complexity makes it hard to apply to the large scale data with high dimensionality, and the used โ2-norm makes it sensitive to outliers. A recent work proposed principal component analysis based on โ1-norm maximization, which is efficient and robust to outliers. In that work, a greedy strategy was applied due to the difficulty of directly solving the โ1-norm maximization problem, which is easy to get stuck in local solution. In this paper, we first propose an efficient optimization algorithm to solve a general โ1-norm maximization problem, and then propose a robust principal component analysis with non-greedy โ1-norm maximization. Experimental results on real world datasets show that the non-greedy method always obtains much better solution than that of the greedy method.
High-Dimensional Inference with the generalized Hopfield Model: Principal Component Analysis and Corrections
Cocco, Simona, Monasson, Remi, Sessak, Vitor
Understanding the patterns of correlations between the components of complex systems is a fundamental issue in various scientific fields, ranging from neurobiology to genomic, from finance to sociology,... A recurrent problem is to distinguish between direct correlations, produced by physiological or functional interactions between the components, and network correlations, which are mediated by other, third-party components. Various approaches have been proposed to infer interactions from correlations, exploiting concepts related to statistical dimensional reduction [1], causality [2], the maximum entropy principle [3], Markov random fields [4]... A major practical and theoretical difficulty in doing so is the paucity and the quality of data: reliable analysis should be able to unveil real patterns of interactions, even if measures are affected by under-or noisy sampling. The size of the interaction network can be comparable to or larger than the number of data, a situation referred to as highdimensional inference. The purpose of the present work is to establish a quantitative correspondence between two of those approaches, namely the inference of Boltzmann Machines (also called Ising model in statistical physics and undirected graphical models for discrete variables in statistical inference [4]) and Principal Component Analysis (PCA) [1]. Inverse Boltzmann Machines (BM) are a mathematically well-founded but computationally challenging approach to infer interactions from correlations.
Auto-associative models, nonlinear Principal component analysis, manifolds and projection pursuit
Girard, Stรฉphane, Iovleff, Serge
In this paper, auto-associative models are proposed as candidates to the generalization of Principal Component Analysis. We show that these models are dedicated to the approximation of the dataset by a manifold. Here, the word "manifold" refers to the topology properties of the structure. The approximating manifold is built by a projection pursuit algorithm. At each step of the algorithm, the dimension of the manifold is incremented. Some theoretical properties are provided. In particular, we can show that, at each step of the algorithm, the mean residuals norm is not increased. Moreover, it is also established that the algorithm converges in a finite number of steps. Some particular auto-associative models are exhibited and compared to the classical PCA and some neural networks models. Implementation aspects are discussed. We show that, in numerous cases, no optimization procedure is required. Some illustrations on simulated and real data are presented.
PCA 4 DCA: The Application Of Principal Component Analysis To The Dendritic Cell Algorithm
Gu, Feng, Greensmith, Julie, Oates, Robert, Aickelin, Uwe
As one of the newest members in the field of artificial immune systems (AIS), the Dendritic Cell Algorithm (DCA) is based on behavioural models of natural dendritic cells (DCs). Unlike other AIS, the DCA does not rely on training data, instead domain or expert knowledge is required to predetermine the mapping between input signals from a particular instance to the three categories used by the DCA. This data preprocessing phase has received the criticism of having manually over-fitted the data to the algorithm, which is undesirable. Therefore, in this paper we have attempted to ascertain if it is possible to use principal component analysis (PCA) techniques to automatically categorise input data while still generating useful and accurate classification results. The integrated system is tested with a biometrics dataset for the stress recognition of automobile drivers. The experimental results have shown the application of PCA to the DCA for the purpose of automated data preprocessing is successful.
Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization
Wright, John, Ganesh, Arvind, Rao, Shankar, Peng, Yigang, Ma, Yi
Principal component analysis is a fundamental operation in computational data analysis, with myriad applications ranging from web search to bioinformatics to computer vision and image analysis. However, its performance and applicability in real scenarios are limited by a lack of robustness to outlying or corrupted observations. Thispaper considers the idealized "robust principal component analysis" problem of recovering a low rank matrix A from corrupted observations D A E. Here, the corrupted entries E are unknown and the errors can be arbitrarily large (modeling grossly corrupted observations common in visual and bioinformatic data), but are assumed to be sparse. We prove that most matrices A can be efficiently and exactly recovered from most error sign-and-support patterns bysolving a simple convex program, for which we give a fast and provably convergent algorithm. Our result holds even when the rank of A grows nearly proportionally (up to a logarithmic factor) to the dimensionality of the observation spaceand the number of errors E grows in proportion to the total number of entries in the matrix. A byproduct of our analysis is the first proportional growth results for the related problem of completing a low-rank matrix from a small fraction ofits entries. Simulations and real-data examples corroborate the theoretical results, and suggest potential applications in computer vision.
Robust Kernel Principal Component Analysis
Nguyen, Minh H., Torre, Fernando
Kernel Principal Component Analysis (KPCA) is a popular generalization of linear PCA that allows non-linear feature extraction. In KPCA, data in the input space is mapped to higher (usually) dimensional feature space where the data can be linearly modeled. The feature space is typically induced implicitly by a kernel function, and linear PCA in the feature space is performed via the kernel trick. However, due to the implicitness of the feature space, some extensions of PCA such as robust PCA cannot be directly generalized to KPCA. This paper presents a technique to overcome this problem, and extends it to a unified framework for treating noise, missing data, and outliers in KPCA. Our method is based on a novel cost function to perform inference in KPCA. Extensive experiments, in both synthetic and real data, show that our algorithm outperforms existing methods.