AITopics

1402.4507

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.67)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Neural Information Processing Systems Dec-31-2013

Robust Transfer Principal Component Analysis with Rank Constraints

Guo, Yuhong

Principal component analysis (PCA), a well-established technique for data analysis and processing, provides a convenient form of dimensionality reduction that is effective for removing small Gaussian noise present in the data. However, the applicability of standard principal component analysis in real scenarios is limited by its sensitivity to large errors. In this paper, we tackle the challenging problem of recovering data corrupted with errors of high magnitude by developing a novel robust transfer principal component analysis method. Our method is based on the assumption that useful information for the recovery of a corrupted data matrix can be gained from an uncorrupted related data matrix. Specifically, we formulate the data recovery problem as a joint robust principal component analysis problem on the two data matrices, with common principal components shared across matrices and individual principal components specific to each data matrix. The formulated optimization problem is a minimization problem over a convex objective function but with non-convex rank constraints. We develop an efficient proximal projected gradient descent algorithm to solve the proposed optimization problem with convergence guarantees. Our empirical results over image denoising tasks show the proposed method can effectively recover images with random large errors, and significantly outperforms both standard PCA and robust PCA with rank constraints.

artificial intelligence, machine learning, matrix

Country: North America > United States > New York (0.15)

Industry: Information Technology > Security & Privacy (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (1.00)

arXiv.org Machine Learning Dec-20-2013

Large-Scale Paralleled Sparse Principal Component Analysis

Liu, W., Zhang, H., Tao, D., Wang, Y., Lu, K.

Principal component analysis (PCA) is a statistical technique commonly used in multivariate data analysis. However, PCA can be difficult to interpret and explain since the principal components (PCs) are linear combinations of the original variables. Sparse PCA (SPCA) aims to balance statistical fidelity and interpretability by approximating sparse PCs whose projections capture the maximal variance of the original data. In this paper we present an efficient parallel method for SPCA using graphics processing units (GPUs), which can process large blocks of data in parallel. Specifically, we construct parallel implementations of the four optimization formulations of the generalized power method of SPCA (GP-SPCA), one of the most efficient and effective SPCA approaches, on a GPU. The parallel GPU implementation of GP-SPCA (using CUBLAS) is up to eleven times faster than the corresponding CPU implementation (using CBLAS), and up to 107 times faster than a MATLAB implementation. Extensive comparative experiments on several real-world datasets confirm that SPCA offers a practical advantage.
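To make the generalized power method more concrete, here is a minimal sequential sketch of what is commonly described as its single-unit, l1-penalized formulation. It is not the paper's CUBLAS or CBLAS implementation; the function names, the `gamma` penalty parameter, and the toy data are assumptions introduced for illustration, and the features are assumed to be centered.

```python
import math


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def soft_threshold(value, gamma):
    """Shrink value toward zero by gamma: sign(value) * max(|value| - gamma, 0)."""
    return math.copysign(max(abs(value) - gamma, 0.0), value)


def gpower_l1(columns, gamma, iters=200):
    """Sparse loading vector from a power-style iteration (illustrative sketch).

    columns[i] holds feature i across all samples (assumed centered).
    gamma is a hypothetical l1 penalty level; larger values give sparser loadings.
    """
    norms = [math.sqrt(dot(col, col)) for col in columns]
    start = max(range(len(columns)), key=norms.__getitem__)
    if gamma >= norms[start]:
        raise ValueError("gamma is too large: every loading would be zero")
    # Initialize with the normalized feature column of largest norm.
    x = [v / norms[start] for v in columns[start]]
    for _ in range(iters):
        # Thresholded correlations between each feature and the current direction.
        weights = [soft_threshold(dot(col, x), gamma) for col in columns]
        # New direction: weighted sum of feature columns, rescaled to unit norm.
        x = [sum(w * col[k] for w, col in zip(weights, columns))
             for k in range(len(x))]
        norm = math.sqrt(dot(x, x))
        x = [v / norm for v in x]
    weights = [soft_threshold(dot(col, x), gamma) for col in columns]
    scale = math.sqrt(dot(weights, weights))
    return [w / scale for w in weights]


# Toy data: two strongly correlated features and one low-energy noise feature.
features = [
    [2.0, -2.0, 2.0, -2.0],
    [1.9, -2.1, 2.0, -1.8],
    [0.1, 0.05, -0.1, 0.02],
]
loading = gpower_l1(features, gamma=0.5)
print(loading)
```

The dominant cost in each iteration is the pair of matrix-vector products (features against the direction, then the weighted sum back), which is the kind of dense linear algebra that BLAS-style libraries on a CPU or GPU are designed to accelerate.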

artificial intelligence, implementation, machine learning

1312.6182

Country: North America > United States > California (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.63)

arXiv.org Machine Learning Sep-11-2013

Sparse and Functional Principal Components Analysis

Allen, Genevera I.

Regularized principal components analysis, especially Sparse PCA and Functional PCA, has become widely used for dimension reduction in high-dimensional settings. Many examples of massive data, however, may benefit from estimating both sparse AND functional factors. These include neuroimaging data where there are discrete brain regions of activation (sparsity) but these regions tend to be smooth spatially (functional). Here, we introduce an optimization framework that can encourage both sparsity and smoothness of the row and/or column PCA factors. This framework generalizes many of the existing approaches to Sparse PCA, Functional PCA and two-way Sparse PCA and Functional PCA, as these are all special cases of our method. In particular, our method permits flexible combinations of sparsity and smoothness that lead to improvements in feature selection and signal recovery as well as more interpretable PCA factors. We demonstrate our method on simulated data and a neuroimaging example on EEG data. This work provides a unified framework for regularized PCA that can form the foundation for a cohesive approach to regularization in high-dimensional multivariate analysis.

artificial intelligence, machine learning, penalty

1309.2895

Genre: Research Report (0.63)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

arXiv.org Machine Learning Jun-29-2013

Sparse Principal Component Analysis for High Dimensional Vector Autoregressive Models

Wang, Zhaoran, Han, Fang, Liu, Han

We study sparse principal component analysis for high dimensional vector autoregressive time series under a doubly asymptotic framework, which allows the dimension $d$ to scale with the series length $T$. We treat the transition matrix of time series as a nuisance parameter and directly apply sparse principal component analysis on multivariate time series as if the data are independent. We provide explicit non-asymptotic rates of convergence for leading eigenvector estimation and extend this result to principal subspace estimation. Our analysis illustrates that the spectral norm of the transition matrix plays an essential role in determining the final rates. We also characterize sufficient conditions under which sparse principal component analysis attains the optimal parametric rate. Our theoretical results are backed up by thorough numerical studies.

artificial intelligence, machine learning, principal component analysis

1307.0164

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.47)
Banking & Finance (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (1.00)

Neural Information Processing Systems Dec-31-2012

Semiparametric Principal Component Analysis

Han, Fang, Liu, Han

eigenvector, matrix, spearman

Country:

North America > United States > New York (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > California (0.04)
Asia > Middle East > Jordan (0.04)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.47)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.62)

Neural Information Processing Systems Dec-31-2012

Semiparametric Principal Component Analysis

Han, Fang, Liu, Han

We propose two new principal component analysis methods in this paper utilizing a semiparametric model. The corresponding methods are named Copula Component Analysis (COCA) and Copula PCA. The semiparametric model assumes that, after unspecified marginally monotone transformations, the distributions are multivariate Gaussian. COCA and Copula PCA estimate the leading eigenvectors of the correlation and covariance matrices, respectively, of the latent Gaussian distribution. The robust nonparametric rank-based correlation coefficient estimator, Spearman's rho, is exploited in estimation. We prove that, under suitable conditions, although the marginal distributions can be arbitrarily continuous, the COCA and Copula PCA estimators obtain fast estimation rates and are feature selection consistent in the setting where the dimension is nearly exponentially large relative to the sample size. Careful numerical experiments on synthetic and real data are conducted to back up the theoretical results. We also discuss the relationship with the transelliptical component analysis proposed by Han and Liu (2012).
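The rank-based estimation step can be sketched as follows. This is an illustration only, not the authors' code: it uses the standard Gaussian-copula identity that maps Spearman's rho to a latent correlation, 2 sin(pi/6 * rho), and then extracts a leading eigenvector by shifted power iteration. The sparsity constraint needed for the high-dimensional results is omitted, the sample values are made up, ties are assumed absent, and all function names are hypothetical.

```python
import math


def ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation for tie-free samples."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def latent_correlation(columns):
    """Estimate the latent Gaussian correlation matrix from ranks only."""
    return [[2.0 * math.sin(math.pi / 6.0 * spearman_rho(a, b))
             for b in columns] for a in columns]


def leading_eigenvector(matrix, iters=500):
    """Power iteration on (matrix + d*I); the shift keeps all eigenvalues
    non-negative because every entry of the matrix lies in [-1, 1]."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) + d * v[i]
             for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


# Made-up sample: each inner list is one variable observed six times.
columns = [
    [0.1, 1.2, -0.7, 2.5, 0.9, -1.6],
    [0.3, 1.0, -0.5, 2.0, 1.4, -1.1],
    [1.5, -0.2, 0.8, -1.9, 0.4, 0.6],
]
# A monotone transformation of one variable leaves the ranks unchanged.
transformed = [[math.exp(v) for v in columns[0]]] + columns[1:]

estimate = latent_correlation(columns)
direction = leading_eigenvector(estimate)
print(direction)
```

Because only ranks enter the estimate, `latent_correlation(columns)` and `latent_correlation(transformed)` are identical, which is the invariance to marginal monotone transformations that the semiparametric model relies on.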

artificial intelligence, eigenvector, machine learning

Country: North America > United States (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.62)

arXiv.org Machine Learning Nov-30-2012

A recursive divide-and-conquer approach for sparse principal component analysis

Zhao, Qian, Meng, Deyu, Xu, Zongben

In this paper, a new method is proposed for sparse PCA based on the recursive divide-and-conquer methodology. The main idea is to separate the original sparse PCA problem into a series of much simpler sub-problems, each having a closed-form solution. By recursively solving these sub-problems in an analytical way, an efficient algorithm is constructed to solve the sparse PCA problem. The algorithm only involves simple computations and is thus easy to implement. The proposed method can also be very easily extended to other sparse PCA problems with certain constraints, such as the nonnegative sparse PCA problem. Furthermore, we have shown that the proposed algorithm converges to a stationary point of the problem, and its computational complexity is approximately linear in both data size and dimensionality. The effectiveness of the proposed method is substantiated by extensive experiments on a series of synthetic and real data sets from both the reconstruction-error-minimization and data-variance-maximization viewpoints.

artificial intelligence, machine learning, sparse pc

1211.7219

Country: North America > United States > Massachusetts > Middlesex County (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.41)

arXiv.org Machine Learning Oct-26-2012

Large-Scale Sparse Principal Component Analysis with Application to Text Data

Zhang, Youwei, El Ghaoui, Laurent

Sparse PCA provides a linear combination of a small number of features that maximizes variance across data. Although Sparse PCA has apparent advantages compared to PCA, such as better interpretability, it is generally thought to be computationally much more expensive. In this paper, we demonstrate the surprising fact that sparse PCA can be easier than PCA in practice, and that it can be reliably applied to very large data sets. This comes from a rigorous feature elimination pre-processing result, coupled with the favorable fact that features in real-life data typically have exponentially decreasing variances, which allows for many features to be eliminated. We introduce a fast block coordinate ascent algorithm with much better computational complexity than the existing first-order ones. We provide experimental results obtained on text corpora involving millions of documents and hundreds of thousands of features. These results illustrate how Sparse PCA can help organize a large corpus of text data in a user-interpretable way, providing an attractive alternative approach to topic models.
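The feature elimination idea can be illustrated with a small pre-processing sketch. The abstract does not state the exact elimination rule; the version below assumes that a feature whose variance falls below the sparsity penalty can be dropped before solving, and the `penalty` parameter, the function names, and the tiny document-term matrix are all hypothetical.

```python
def feature_variances(rows):
    """Per-feature variance (mean squared deviation) of a row-major data matrix."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [sum((row[j] - means[j]) ** 2 for row in rows) / n
            for j in range(d)]


def eliminate_features(rows, penalty):
    """Keep only features whose variance is at least the penalty (assumed rule).

    Returns the kept column indices and the reduced data matrix.
    """
    variances = feature_variances(rows)
    kept = [j for j, var in enumerate(variances) if var >= penalty]
    reduced = [[row[j] for j in kept] for row in rows]
    return kept, reduced


# Hypothetical document-term counts: 4 documents, 4 terms.
counts = [
    [4, 0, 1, 0],
    [0, 3, 1, 0],
    [5, 0, 1, 1],
    [1, 4, 1, 0],
]
kept, reduced = eliminate_features(counts, penalty=1.0)
print(kept)
print(reduced)
```

When variances decay quickly across features, as the abstract reports for real-life data, a filter of this kind leaves only a small sub-matrix for the actual sparse PCA solver.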

algorithm, artificial intelligence, machine learning

1210.7054

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.42)