AITopics | Principal Component Analysis

Collaborating Authors

Principal Component Analysis

News Overviews Instructional Materials AI-Alerts Classics

Modularity Component Analysis versus Principal Component Analysis

arXiv.org Machine LearningFeb-19-2016

In this paper the exact linear relation between the leading eigenvectors of the modularity matrix and the singular vectors of an uncentered data matrix is developed. Based on this analysis the concept of a modularity component is defined, and its properties are developed. It is shown that modularity component analysis can be used to cluster data similar to how traditional principal component analysis is used except that modularity component analysis does not require data centering.

artificial intelligence, machine learning, modularity component, (15 more...)

arXiv.org Machine Learning

doi: 10.11648/j.ajam.20160402.15

1510.05492

Country: North America > United States > North Carolina (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

On the Nystr\"om and Column-Sampling Methods for the Approximate Principal Components Analysis of Large Data Sets

Homrighausen, Darren, McDonald, Daniel J.

arXiv.org Machine LearningFeb-2-2016

In this paper we analyze approximate methods for undertaking a principal components analysis (PCA) on large data sets. PCA is a classical dimension reduction method that involves the projection of the data onto the subspace spanned by the leading eigenvectors of the covariance matrix. This projection can be used either for exploratory purposes or as an input for further analysis, e.g. regression. If the data have billions of entries or more, the computational and storage requirements for saving and manipulating the design matrix in fast memory is prohibitive. Recently, the Nystr\"om and column-sampling methods have appeared in the numerical linear algebra community for the randomized approximation of the singular value decomposition of large matrices. However, their utility for statistical applications remains unclear. We compare these approximations theoretically by bounding the distance between the induced subspaces and the desired, but computationally infeasible, PCA subspace. Additionally we show empirically, through simulations and a real data example involving a corpus of emails, the trade-off of approximation accuracy and computational complexity.

approximation, data mining, machine learning, (22 more...)

arXiv.org Machine Learning

doi: 10.1080/10618600.2014.995799

1602.0112

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

Sparse Generalized Principal Component Analysis for Large-scale Applications beyond Gaussianity

Zhang, Qiaoya, She, Yiyuan

arXiv.org Machine LearningJan-27-2016

Principal Component Analysis (PCA) is a dimension reduction technique. It produces inconsistent estimators when the dimensionality is moderate to high, which is often the problem in modern large-scale applications where algorithm scalability and model interpretability are difficult to achieve, not to mention the prevalence of missing values. While existing sparse PCA methods alleviate inconsistency, they are constrained to the Gaussian assumption of classical PCA and fail to address algorithm scalability issues. We generalize sparse PCA to the broad exponential family distributions under high-dimensional setup, with built-in treatment for missing values. Meanwhile we propose a family of iterative sparse generalized PCA (SG-PCA) algorithms such that despite the non-convexity and non-smoothness of the optimization task, the loss function decreases in every iteration. In terms of ease and intuitive parameter tuning, our sparsity-inducing regularization is far superior to the popular Lasso. Furthermore, to promote overall scalability, accelerated gradient is integrated for fast convergence, while a progressive screening technique gradually squeezes out nuisance dimensions of a large-scale problem for feasible optimization. High-dimensional simulation and real data experiments demonstrate the efficiency and efficacy of SG-PCA.

algorithm, dimension, sg-pca, (15 more...)

arXiv.org Machine Learning

1512.03883

Country:

North America > United States > Florida > Leon County > Tallahassee (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

Robust PCA with compressed data

Ha, Wooseok, Barber, Rina Foygel

Neural Information Processing SystemsDec-31-2015

The robust principal component analysis (RPCA) problem seeks to separate low-rank trends from sparse outlierswithin a data matrix, that is, to approximate a $n\times d$ matrix $D$ as the sum of a low-rank matrix $L$ and a sparse matrix $S$.We examine the robust principal component analysis (RPCA) problem under data compression, wherethe data $Y$ is approximately given by $(L + S)\cdot C$, that is, a low-rank $+$ sparse data matrix that has been compressed to size $n\times m$ (with $m$ substantially smaller than the original dimension $d$) via multiplication witha compression matrix $C$. We give a convex program for recovering the sparse component $S$ along with the compressed low-rank component $L\cdot C$, along with upper bounds on the error of this reconstructionthat scales naturally with the compression dimension $m$ and coincides with existing results for the uncompressedsetting $m=d$. Our results can also handle error introduced through additive noise or through missing data.The scaling of dimension, compression, and signal complexity in our theoretical results is verified empirically through simulations, and we also apply our method to a data set measuring chlorine concentration acrossa network of sensors, to test its performance in practice.

artificial intelligence, machine learning, matrix, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.29)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.45)

Add feedback

Sufficient Forecasting Using Factor Models

Fan, Jianqing, Xue, Lingzhou, Yao, Jiawei

arXiv.org Machine LearningDec-24-2015

We consider forecasting a single time series when there is a large number of predictors and a possible nonlinear effect. The dimensionality was first reduced via a high-dimensional (approximate) factor model implemented by the principal component analysis. Using the extracted factors, we develop a novel forecasting method called the sufficient forecasting, which provides a set of sufficient predictive indices, inferred from high-dimensional predictors, to deliver additional predictive power. The projected principal component analysis will be employed to enhance the accuracy of inferred factors when a semi-parametric (approximate) factor model is assumed. Our method is also applicable to cross-sectional sufficient regression using extracted factors. The connection between the sufficient forecasting and the deep learning architecture is explicitly stated. The sufficient forecasting correctly estimates projection indices of the underlying factors even in the presence of a nonparametric forecasting function. The proposed method extends the sufficient dimension reduction to high-dimensional regimes by condensing the cross-sectional information through factor models. We derive asymptotic properties for the estimate of the central subspace spanned by these projection directions as well as the estimates of the sufficient predictive indices. We further show that the natural method of running multiple regression of target on estimated factors yields a linear estimate that actually falls into this central subspace. Our method and theory allow the number of predictors to be larger than the number of observations. We finally demonstrate that the sufficient forecasting improves upon the linear forecasting in both simulation studies and an empirical study of forecasting macroeconomic variables.

artificial intelligence, forecasting, machine learning, (20 more...)

arXiv.org Machine Learning

1505.07414

Country: North America > United States (0.93)

Genre: Research Report (0.81)

Industry:

Health & Medicine (1.00)
Banking & Finance > Economy (0.92)
Government (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.44)

Add feedback

Streaming Kernel Principal Component Analysis

Ghashami, Mina, Perry, Daniel, Phillips, Jeff M.

arXiv.org Machine LearningDec-16-2015

Kernel principal component analysis (KPCA) provides a concise set of basis vectors which capture non-linear structures within large data sets, and is a central tool in data analysis and learning. To allow for non-linear relations, typically a full $n \times n$ kernel matrix is constructed over $n$ data points, but this requires too much space and time for large values of $n$. Techniques such as the Nystr\"om method and random feature maps can help towards this goal, but they do not explicitly maintain the basis vectors in a stream and take more space than desired. We propose a new approach for streaming KPCA which maintains a small set of basis elements in a stream, requiring space only logarithmic in $n$, and also improves the dependence on the error parameter. Our technique combines together random feature maps with recent advances in matrix sketching, it has guaranteed spectral norm error bounds with respect to the original kernel matrix, and it compares favorably in practice to state-of-the-art approaches.

artificial intelligence, machine learning, matrix, (17 more...)

arXiv.org Machine Learning

1512.05059

Country: North America > United States (0.15)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

Wishart Mechanism for Differentially Private Principal Components Analysis

Jiang, Wuxuan, Xie, Cong, Zhang, Zhihua

arXiv.org Machine LearningNov-19-2015

We propose a new input perturbation mechanism for publishing a covariance matrix to achieve $(\epsilon,0)$-differential privacy. Our mechanism uses a Wishart distribution to generate matrix noise. In particular, We apply this mechanism to principal component analysis. Our mechanism is able to keep the positive semi-definiteness of the published covariance matrix. Thus, our approach gives rise to a general publishing framework for input perturbation of a symmetric positive semidefinite matrix. Moreover, compared with the classic Laplace mechanism, our method has better utility guarantee. To the best of our knowledge, Wishart mechanism is the best input perturbation approach for $(\epsilon,0)$-differentially private PCA. We also compare our work with previous exponential mechanism algorithms in the literature and provide near optimal bound while having more flexibility and less computational intractability.

artificial intelligence, machine learning, matrix, (17 more...)

arXiv.org Machine Learning

1511.0568

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

Online Principal Component Analysis in High Dimension: Which Algorithm to Choose?

Cardot, Hervé, Degras, David

arXiv.org Machine LearningNov-11-2015

In the current context of data explosion, online techniques that do not require storing all data in memory are indispensable to routinely perform tasks like principal component analysis (PCA). Recursive algorithms that update the PCA with each new observation have been studied in various fields of research and found wide applications in industrial monitoring, computer vision, astronomy, and latent semantic indexing, among others. This work provides guidance for selecting an online PCA algorithm in practice. We present the main approaches to online PCA, namely, perturbation techniques, incremental methods, and stochastic optimization, and compare their statistical accuracy, computation time, and memory requirements using artificial and real data. Extensions to missing data and to functional data are discussed. All studied algorithms are available in the R package onlinePCA on CRAN.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1511.03688

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

Reduced-Set Kernel Principal Components Analysis for Improving the Training and Execution Speed of Kernel Machines

Kingravi, Hassan A., Vela, Patricio A., Gray, Alexandar

arXiv.org Machine LearningJul-26-2015

This paper presents a practical, and theoretically well-founded, approach to improve the speed of kernel manifold learning algorithms relying on spectral decomposition. Utilizing recent insights in kernel smoothing and learning with integral operators, we propose Reduced Set KPCA (RSKPCA), which also suggests an easy-to-implement method to remove or replace samples with minimal effect on the empirical operator. A simple data point selection procedure is given to generate a substitute density for the data, with accuracy that is governed by a user-tunable parameter . The effect of the approximation on the quality of the KPCA solution, in terms of spectral and operator errors, can be shown directly in terms of the density estimate error and as a function of the parameter . We show in experiments that RSKPCA can improve both training and evaluation time of KPCA by up to an order of magnitude, and compares favorably to the widely-used Nystrom and density-weighted Nystrom methods.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1507.0726

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.41)

Add feedback

Tensor principal component analysis via sum-of-squares proofs

Hopkins, Samuel B., Shi, Jonathan, Steurer, David

arXiv.org Machine LearningJul-12-2015

We study a statistical model for the tensor principal component analysis problem introduced by Montanari and Richard: Given a order-$3$ tensor $T$ of the form $T = \tau \cdot v_0^{\otimes 3} + A$, where $\tau \geq 0$ is a signal-to-noise ratio, $v_0$ is a unit vector, and $A$ is a random noise tensor, the goal is to recover the planted vector $v_0$. For the case that $A$ has iid standard Gaussian entries, we give an efficient algorithm to recover $v_0$ whenever $\tau \geq \omega(n^{3/4} \log(n)^{1/4})$, and certify that the recovered vector is close to a maximum likelihood estimator, all with high probability over the random choice of $A$. The previous best algorithms with provable guarantees required $\tau \geq \Omega(n)$. In the regime $\tau \leq o(n)$, natural tensor-unfolding-based spectral relaxations for the underlying optimization problem break down (in the sense that their integrality gap is large). To go beyond this barrier, we use convex relaxations based on the sum-of-squares method. Our recovery algorithm proceeds by rounding a degree-$4$ sum-of-squares relaxations of the maximum-likelihood-estimation problem for the statistical model. To complement our algorithmic results, we show that degree-$4$ sum-of-squares relaxations break down for $\tau \leq O(n^{3/4}/\log(n)^{1/4})$, which demonstrates that improving our current guarantees (by more than logarithmic factors) would require new techniques or might even be intractable. Finally, we show how to exploit additional problem structure in order to solve our sum-of-squares relaxations, up to some approximation, very efficiently. Our fastest algorithm runs in nearly-linear time using shifted (matrix) power iteration and has similar guarantees as above. The analysis of this algorithm also confirms a variant of a conjecture of Montanari and Richard about singular vectors of tensor unfoldings.

artificial intelligence, bayesian inference, machine learning, (20 more...)

arXiv.org Machine Learning

1507.03269

Country: North America > United States (0.45)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.60)

Add feedback