AITopics | Principal Component Analysis

Collaborating Authors

Principal Component Analysis

News Overviews Instructional Materials AI-Alerts Classics

Sparse Principal Component Analysis with Constraints

Grbovic, Mihajlo (Temple University) | Dance, Christopher Roger (Xerox Research Centre Europe) | Vucetic, Slobodan (Temple University)

AAAI ConferencesJul-21-2012

The sparse principal component analysis is a variant of the classical principal component analysis, which finds linear combinations of a small number of features that maximize variance across data. In this paper we propose a methodology for adding two general types of feature grouping constraints into the original sparse PCA optimization procedure.We derive convex relaxations of the considered constraints, ensuring the convexity of the resulting optimization problem. Empirical evaluation on three real-world problems, one in process monitoring sensor networks and two in social networks, serves to illustrate the usefulness of the proposed methodology.

artificial intelligence, constraint, machine learning, (16 more...)

AAAI Conferences

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Tennessee (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.81)

Add feedback

Regularized Tensor Factorizations and Higher-Order Principal Components Analysis

Allen, Genevera I.

arXiv.org Machine LearningFeb-11-2012

High-dimensional tensors or multi-way data are becoming prevalent in areas such as biomedical imaging, chemometrics, networking and bibliometrics. Traditional approaches to finding lower dimensional representations of tensor data include flattening the data and applying matrix factorizations such as principal components analysis (PCA) or employing tensor decompositions such as the CANDECOMP / PARAFAC (CP) and Tucker decompositions. The former can lose important structure in the data, while the latter Higher-Order PCA (HOPCA) methods can be problematic in high-dimensions with many irrelevant features. We introduce frameworks for sparse tensor factorizations or Sparse HOPCA based on heuristic algorithmic approaches and by solving penalized optimization problems related to the CP decomposition. Extensions of these approaches lead to methods for general regularized tensor factorizations, multi-way Functional HOPCA and generalizations of HOPCA for structured data. We illustrate the utility of our methods for dimension reduction, feature selection, and signal recovery on simulated data and multi-dimensional microarrays and functional MRIs.

artificial intelligence, decomposition, machine learning, (15 more...)

arXiv.org Machine Learning

1202.2476

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Large-Scale Sparse Principal Component Analysis with Application to Text Data

Zhang, Youwei, Ghaoui, Laurent E.

Neural Information Processing SystemsDec-31-2011

Sparse PCA provides a linear combination of small number of features that maximizes variance across data. Although Sparse PCA has apparent advantages compared to PCA, such as better interpretability, it is generally thought to be computationally much more expensive. In this paper, we demonstrate the surprising fact that sparse PCA can be easier than PCA in practice, and that it can be reliably applied to very large data sets. This comes from a rigorous feature elimination pre-processing result, coupled with the favorable fact that features in real-life data typically have exponentially decreasing variances, which allows for many features to be eliminated. We introduce a fast block coordinate ascent algorithm with much better computational complexity than the existing first-order ones. We provide experimental results obtained on text corpora involving millions of documents and hundreds of thousands of features. These results illustrate how Sparse PCA can help organize a large corpus of text data in a user-interpretable way, providing an attractive alternative approach to topic models.

algorithm, artificial intelligence, machine learning, (11 more...)

Neural Information Processing Systems

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.42)

Add feedback

Demixed Principal Component Analysis

Brendel, Wieland, Romo, Ranulfo, Machens, Christian K.

Neural Information Processing SystemsDec-31-2011

In many experiments, the data points collected live in high-dimensional observation spaces, yet can be assigned a set of labels or parameters. In electrophysiological recordings, for instance, the responses of populations of neurons generally depend on mixtures of experimentally controlled parameters. The heterogeneity and diversity of these parameter dependencies can make visualization and interpretation of such data extremely difficult. Standard dimensionality reduction techniques such as principal component analysis (PCA) can provide a succinct and complete description of the data, but the description is constructed independent of the relevant task variables and is often hard to interpret. Here, we start with the assumption that a particularly informative description is one that reveals the dependency of the high-dimensional data on the individual parameters. We show how to modify the loss function of PCA so that the principal components seek to capture both the maximum amount of variance about the data, while also depending on a minimum number of parameters. We call this method demixed principal component analysis (dPCA) as the principal components here segregate the parameter dependencies. We phrase the problem as a probabilistic graphical model, and present a fast Expectation-Maximization (EM) algorithm. We demonstrate the use of this algorithm for electrophysiological data and show that it serves to demix the parameter-dependence of a neural population response.

artificial intelligence, machine learning, variance, (15 more...)

Neural Information Processing Systems

Country:

Europe (0.47)
North America > Mexico (0.28)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.82)

Add feedback

Truncated Power Method for Sparse Eigenvalue Problems

Yuan, Xiao-Tong, Zhang, Tong

arXiv.org Artificial IntelligenceDec-12-2011

The sparsity is controlled by the values of k and can be viewed as a design parameter. In machine learning applications, e.g., principal component analysis, this problem is motivated from the following perturbation formulation of matrix A: A Ā E, (1.2) where A is the empirical covariance matrix, Ā is the true covariance matrix, and E is a random perturbation due to having only a finite number of empirical samples. If we assume that the largest eigenvector x of Ā is sparse, then a natural question is to recover x from the noisy observation A when the error E is "small". In this context, the problem (1.1) is also referred to as sparse principal component analysis (sparse PCA). 1 In general, problem (1.1) is non-convex. In fact, it is also NPhard because it can be reduced to the subset selection problem for ordinary least squares regression (Moghaddam et al., 2006), which is known to be NP hard.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1112.2679

Country: North America > United States (0.67)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.45)

Add feedback

Robust Principal Component Analysis with Non-Greedy ℓ1-Norm Maximization

Nie, Feiping (University of Texas at Arlington) | Huang, Heng (University of Texas at Arlington) | Ding, Chris (University of Texas at Arlington) | Luo, Dijun (University of Texas at Arlington) | Wang, Hua (University of Texas at Arlington)

AAAI ConferencesJul-19-2011

Principal Component Analysis (PCA) is one of the most important methods to handle high-dimensional data. However, the high computa-tional complexity makes it hard to apply to the large scale data with high dimensionality, and the used ℓ2-norm makes it sensitive to outliers. A recent work proposed principal component analysis based on ℓ1-norm maximization, which is efficient and robust to outliers. In that work, a greedy strategy was applied due to the difficulty of directly solving the ℓ1-norm maximization problem, which is easy to get stuck in local solution. In this paper, we first propose an efficient optimization algorithm to solve a general ℓ1-norm maximization problem, and then propose a robust principal component analysis with non-greedy ℓ1-norm maximization. Experimental results on real world datasets show that the non-greedy method always obtains much better solution than that of the greedy method.

algorithm, component analysis, principal component analysis, (13 more...)

AAAI Conferences

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Texas > Tarrant County > Arlington (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (1.00)

Add feedback

High-Dimensional Inference with the generalized Hopfield Model: Principal Component Analysis and Corrections

Cocco, Simona, Monasson, Remi, Sessak, Vitor

arXiv.org Machine LearningApr-19-2011

Understanding the patterns of correlations between the components of complex systems is a fundamental issue in various scientific fields, ranging from neurobiology to genomic, from finance to sociology,... A recurrent problem is to distinguish between direct correlations, produced by physiological or functional interactions between the components, and network correlations, which are mediated by other, third-party components. Various approaches have been proposed to infer interactions from correlations, exploiting concepts related to statistical dimensional reduction [1], causality [2], the maximum entropy principle [3], Markov random fields [4]... A major practical and theoretical difficulty in doing so is the paucity and the quality of data: reliable analysis should be able to unveil real patterns of interactions, even if measures are affected by under-or noisy sampling. The size of the interaction network can be comparable to or larger than the number of data, a situation referred to as highdimensional inference. The purpose of the present work is to establish a quantitative correspondence between two of those approaches, namely the inference of Boltzmann Machines (also called Ising model in statistical physics and undirected graphical models for discrete variables in statistical inference [4]) and Principal Component Analysis (PCA) [1]. Inverse Boltzmann Machines (BM) are a mathematically well-founded but computationally challenging approach to infer interactions from correlations.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

doi: 10.1103/PhysRevE.83.051123

1104.3665

Country:

Europe (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Auto-associative models, nonlinear Principal component analysis, manifolds and projection pursuit

Girard, Stéphane, Iovleff, Serge

arXiv.org Machine LearningMar-31-2011

In this paper, auto-associative models are proposed as candidates to the generalization of Principal Component Analysis. We show that these models are dedicated to the approximation of the dataset by a manifold. Here, the word "manifold" refers to the topology properties of the structure. The approximating manifold is built by a projection pursuit algorithm. At each step of the algorithm, the dimension of the manifold is incremented. Some theoretical properties are provided. In particular, we can show that, at each step of the algorithm, the mean residuals norm is not increased. Moreover, it is also established that the algorithm converges in a finite number of steps. Some particular auto-associative models are exhibited and compared to the classical PCA and some neural networks models. Implementation aspects are discussed. We show that, in numerous cases, no optimization procedure is required. Some illustrations on simulated and real data are presented.

artificial intelligence, auto-associative model, machine learning, (15 more...)

arXiv.org Machine Learning

1103.6119

Country:

North America > United States (0.93)
Europe (0.68)

Genre:

Workflow (0.54)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

PCA 4 DCA: The Application Of Principal Component Analysis To The Dendritic Cell Algorithm

Gu, Feng, Greensmith, Julie, Oates, Robert, Aickelin, Uwe

arXiv.org Artificial IntelligenceApr-20-2010

As one of the newest members in the field of artificial immune systems (AIS), the Dendritic Cell Algorithm (DCA) is based on behavioural models of natural dendritic cells (DCs). Unlike other AIS, the DCA does not rely on training data, instead domain or expert knowledge is required to predetermine the mapping between input signals from a particular instance to the three categories used by the DCA. This data preprocessing phase has received the criticism of having manually over-fitted the data to the algorithm, which is undesirable. Therefore, in this paper we have attempted to ascertain if it is possible to use principal component analysis (PCA) techniques to automatically categorise input data while still generating useful and accurate classification results. The integrated system is tested with a biometrics dataset for the stress recognition of automobile drivers. The experimental results have shown the application of PCA to the DCA for the purpose of automated data preprocessing is successful.

artificial intelligence, evolutionary algorithm, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1004.346

Country:

North America > United States > Massachusetts (0.14)
Europe > United Kingdom (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Immunology (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.49)

Add feedback

Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization

Wright, John, Ganesh, Arvind, Rao, Shankar, Peng, Yigang, Ma, Yi

Neural Information Processing SystemsDec-31-2009

Principal component analysis is a fundamental operation in computational data analysis, with myriad applications ranging from web search to bioinformatics to computer vision and image analysis. However, its performance and applicability in real scenarios are limited by a lack of robustness to outlying or corrupted observations. Thispaper considers the idealized "robust principal component analysis" problem of recovering a low rank matrix A from corrupted observations D A E. Here, the corrupted entries E are unknown and the errors can be arbitrarily large (modeling grossly corrupted observations common in visual and bioinformatic data), but are assumed to be sparse. We prove that most matrices A can be efficiently and exactly recovered from most error sign-and-support patterns bysolving a simple convex program, for which we give a fast and provably convergent algorithm. Our result holds even when the rank of A grows nearly proportionally (up to a logarithmic factor) to the dimensionality of the observation spaceand the number of errors E grows in proportion to the total number of entries in the matrix. A byproduct of our analysis is the first proportional growth results for the related problem of completing a low-rank matrix from a small fraction ofits entries. Simulations and real-data examples corroborate the theoretical results, and suggest potential applications in computer vision.

algorithm, low-rank matrix, matrix, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Illinois (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.81)

Add feedback