AITopics | Principal Component Analysis


Principal Component Analysis


Demixed Principal Component Analysis

Neural Information Processing Systems | Apr-6-2023, 13:12:11 GMT

In many experiments, the data points collected live in high-dimensional observation spaces, yet can be assigned a set of labels or parameters. In electrophysiological recordings, for instance, the responses of populations of neurons generally depend on mixtures of experimentally controlled parameters. The heterogeneity and diversity of these parameter dependencies can make visualization and interpretation of such data extremely difficult. Standard dimensionality reduction techniques such as principal component analysis (PCA) can provide a succinct and complete description of the data, but the description is constructed independently of the relevant task variables and is often hard to interpret. Here, we start with the assumption that a particularly informative description is one that reveals the dependency of the high-dimensional data on the individual parameters.

artificial intelligence, demixed principal component analysis, machine learning, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.68)


Large-Scale Sparse Principal Component Analysis with Application to Text Data

Neural Information Processing Systems | Apr-6-2023, 12:47:16 GMT

Sparse PCA finds a linear combination of a small number of features that maximizes variance across the data. Although sparse PCA has apparent advantages over PCA, such as better interpretability, it is generally thought to be computationally much more expensive. In this paper, we demonstrate the surprising fact that sparse PCA can be easier than PCA in practice, and that it can be reliably applied to very large data sets. This comes from a rigorous feature elimination pre-processing result, coupled with the favorable fact that features in real-life data typically have exponentially decreasing variances, which allows many features to be eliminated. We introduce a fast block coordinate ascent algorithm with much better computational complexity than the existing first-order ones.
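The feature elimination idea can be illustrated with a small sketch. It assumes the pre-processing rule is a plain variance threshold; the function name `eliminate_low_variance_features`, the threshold value, and the synthetic data are all hypothetical, and the block coordinate ascent solver itself is not shown.

```python
import numpy as np

def eliminate_low_variance_features(X, threshold):
    """Keep only the columns of X whose sample variance is at least `threshold`."""
    variances = X.var(axis=0)
    keep = np.flatnonzero(variances >= threshold)
    return X[:, keep], keep

# Synthetic data whose feature variances decay exponentially (exp(-k) for feature k),
# mimicking the variance profile the abstract attributes to real-life data.
rng = np.random.default_rng(0)
n_features = 50
scales = np.exp(-0.5 * np.arange(n_features))  # standard deviations
X = rng.normal(size=(200, n_features)) * scales

# The threshold is an illustrative choice, not a value from the paper.
X_reduced, kept = eliminate_low_variance_features(X, threshold=1e-3)
print(f"kept {len(kept)} of {n_features} features")
```

With variances that decay this fast, only a handful of columns survive the threshold, so any sparse PCA solver run afterwards works on a much smaller matrix.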

application, large-scale sparse principal component analysis, text data, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.40)


Near-optimal Differentially Private Principal Components

Neural Information Processing Systems | Apr-6-2023, 12:43:35 GMT

Principal components analysis (PCA) is a standard tool for identifying good low-dimensional approximations to data sets in high dimension. Many current data sets of interest contain private or sensitive information about individuals. Algorithms which operate on such data should be sensitive to the privacy risks in publishing their outputs. Differential privacy is a framework for developing tradeoffs between privacy and the utility of these outputs. In this paper we investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output.
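To make the privacy/utility tradeoff concrete, here is a minimal sketch of a simple input-perturbation baseline: symmetric Gaussian noise is added to the second-moment matrix before the eigendecomposition. This is not the method the paper proposes, and `noise_std` is a placeholder; calibrating it to a specific privacy budget is not shown. The function name `noisy_top_subspace` and the data are hypothetical.

```python
import numpy as np

def noisy_top_subspace(X, k, noise_std, rng):
    """Top-k eigenvectors of the second-moment matrix after adding symmetric noise."""
    n, d = X.shape
    A = X.T @ X / n
    E = rng.normal(scale=noise_std, size=(d, d))
    E = np.triu(E) + np.triu(E, 1).T          # symmetrize the noise matrix
    _, eigvecs = np.linalg.eigh(A + E)        # eigenvalues come back in ascending order
    return eigvecs[:, ::-1][:, :k]            # columns for the k largest eigenvalues

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)) * np.linspace(3.0, 0.5, 10)
X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm rows, a common assumption here

A = X.T @ X / len(X)
V_private = noisy_top_subspace(X, k=2, noise_std=0.01, rng=rng)
V_exact = noisy_top_subspace(X, k=2, noise_std=0.0, rng=rng)

# Utility measured as the variance captured by the returned subspace.
utility_private = np.trace(V_private.T @ A @ V_private)
utility_exact = np.trace(V_exact.T @ A @ V_exact)
print(utility_private, utility_exact)
```

The captured variance of the noisy subspace can never exceed that of the exact one, and the gap widens as `noise_std` grows, which is the tradeoff the abstract refers to.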

artificial intelligence, machine learning, principal component analysis, (3 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Security & Privacy (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.30)


Semiparametric Principal Component Analysis

Neural Information Processing Systems | Apr-6-2023, 12:32:52 GMT

We propose two new principal component analysis methods in this paper utilizing a semiparametric model. The corresponding methods are named Copula Component Analysis (COCA) and Copula PCA. The semiparametric model assumes that, after unspecified marginally monotone transformations, the distributions are multivariate Gaussian. The robust nonparametric rank-based correlation coefficient estimator, Spearman's rho, is exploited in estimation. We prove that, under suitable conditions, although the marginal distributions can be arbitrarily continuous, the COCA and Copula PCA estimators obtain fast estimation rates and are feature selection consistent in the setting where the dimension is nearly exponentially large relative to the sample size.
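The rank-based estimation step can be sketched as follows. The sketch assumes the usual Gaussian-copula conversion from Spearman's rho to a latent correlation, 2 sin(pi/6 * rho), followed by an ordinary eigendecomposition; the function names and the toy data are hypothetical, and the sparsity machinery needed for feature selection is omitted.

```python
import numpy as np

def spearman_correlation(X):
    """Spearman's rho matrix: Pearson correlation of column-wise ranks (assumes no ties)."""
    ranks = X.argsort(axis=0).argsort(axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)

def rank_based_leading_direction(X):
    """Leading eigenvector of the rank-based latent correlation estimate."""
    rho = spearman_correlation(X)
    R = 2.0 * np.sin(np.pi / 6.0 * rho)   # maps rho to a latent Gaussian correlation
    _, eigvecs = np.linalg.eigh(R)        # ascending eigenvalues
    return eigvecs[:, -1], R

# Latent Gaussian data, observed only through strictly increasing marginal transforms.
rng = np.random.default_rng(0)
latent_corr = np.array([[1.0, 0.8, 0.0],
                        [0.8, 1.0, 0.0],
                        [0.0, 0.0, 1.0]])
Z = rng.multivariate_normal(np.zeros(3), latent_corr, size=2000)
X = np.column_stack([np.exp(Z[:, 0]), Z[:, 1] ** 3, Z[:, 2]])

direction, R = rank_based_leading_direction(X)
print(np.round(R, 2))
print(np.round(direction, 2))
```

Because ranks are unchanged by strictly increasing transformations, the estimate computed from `X` is identical to the one computed from the unobserved `Z`, which is why the marginal distributions do not need to be specified.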

artificial intelligence, machine learning, semiparametric principal component analysis

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.66)


Probabilistic Principal Geodesic Analysis

Neural Information Processing Systems | Apr-6-2023, 12:04:16 GMT

Principal geodesic analysis (PGA) is a generalization of principal component analysis (PCA) for dimensionality reduction of data on a Riemannian manifold. Currently PGA is defined as a geometric fit to the data, rather than as a probabilistic model. Inspired by probabilistic PCA, we present a latent variable model for PGA that provides a probabilistic framework for factor analysis on manifolds. To compute maximum likelihood estimates of the parameters in our model, we develop a Monte Carlo Expectation Maximization algorithm, where the expectation is approximated by Hamiltonian Monte Carlo sampling of the latent variables. We demonstrate the ability of our method to recover the ground truth parameters in simulated sphere data, as well as its effectiveness in analyzing shape variability of a corpus callosum data set from human brain images.

machine learning, principal component analysis, probabilistic principal geodesic analysis, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.31)


Robust Transfer Principal Component Analysis with Rank Constraints

Neural Information Processing Systems | Apr-6-2023, 12:02:44 GMT

Principal component analysis (PCA), a well-established technique for data analysis and processing, provides a convenient form of dimensionality reduction that is effective for removing small Gaussian noise present in the data. However, the applicability of standard principal component analysis in real scenarios is limited by its sensitivity to large errors. In this paper, we tackle the challenging problem of recovering data corrupted with errors of high magnitude by developing a novel robust transfer principal component analysis method. Our method is based on the assumption that useful information for the recovery of a corrupted data matrix can be gained from an uncorrupted related data matrix. Specifically, we formulate the data recovery problem as a joint robust principal component analysis problem on the two data matrices, with shared common principal components across matrices and individual principal components specific to each data matrix.

artificial intelligence, machine learning, robust transfer principal component analysis, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (1.00)


On the Sample Complexity of Subspace Learning

Neural Information Processing Systems | Apr-6-2023, 11:59:05 GMT

A large number of algorithms in machine learning, from principal component analysis (PCA) and its non-linear (kernel) extensions to more recent spectral embedding and support estimation methods, rely on estimating a linear subspace from samples. In this paper we introduce a general formulation of this problem and derive novel learning error estimates. Our results rely on natural assumptions on the spectral properties of the covariance operator associated with the data distribution, and hold for a wide class of metrics between subspaces. As special cases, we discuss sharp error estimates for the reconstruction properties of PCA and spectral support estimation. Key to our analysis is an operator theoretic approach that has broad applicability to spectral learning methods.

artificial intelligence, machine learning, subspace learning, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.31)


Online Robust PCA via Stochastic Optimization

Neural Information Processing Systems | Apr-6-2023, 11:52:20 GMT

Robust PCA methods are typically based on batch optimization and have to load all the samples into memory. This prevents them from efficiently processing big data. In this paper, we develop an Online Robust Principal Component Analysis (OR-PCA) method that processes one sample per time instance, so its memory cost is independent of the data size, significantly enhancing computation and storage efficiency. The proposed method is based on stochastic optimization of an equivalent reformulation of the batch RPCA method. Indeed, we show that OR-PCA provides a sequence of subspace estimations converging to the optimum of its batch counterpart and hence is provably robust to sparse corruption.

artificial intelligence, machine learning, stochastic optimization, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.30)


Principal Component Analysis based frameworks for efficient missing data imputation algorithms

Nguyen, Thu, Ly, Hoang Thien, Riegler, Michael Alexander, Halvorsen, Pål, Hammer, Hugo L.

arXiv.org Artificial Intelligence | Mar-19-2023

Missing data is a commonly occurring problem in practice. Many imputation methods have been developed to fill in the missing entries. However, not all of them can scale to high-dimensional data, especially the multiple imputation techniques. Meanwhile, data nowadays tends to be high-dimensional. Therefore, in this work, we propose Principal Component Analysis Imputation (PCAI), a simple but versatile framework based on Principal Component Analysis (PCA) to speed up the imputation process and alleviate memory issues of many available imputation techniques, without sacrificing the imputation quality in terms of MSE. In addition, the framework can be used even when some or all of the missing features are categorical, or when the number of missing features is large. Next, we introduce PCA Imputation - Classification (PIC), an application of PCAI for classification problems with some adjustments. We validate our approach by experiments on various scenarios, which show that PCAI and PIC can work with various imputation algorithms, including the state-of-the-art ones, and improve the imputation speed significantly, while achieving competitive mean squared error and classification accuracy compared to direct imputation (i.e., imputing directly on the missing data).
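One possible reading of the framework is sketched below. The abstract does not spell out the steps, so the pipeline here is an assumption: the fully observed features are compressed with PCA, and an imputer then runs on the compressed block together with the features that have missing values. The least-squares imputer is only a stand-in for whichever imputation algorithm is plugged in, and all function names are hypothetical.

```python
import numpy as np

def pca_reduce(F, r):
    """Project the fully observed block F onto its first r principal components."""
    Fc = F - F.mean(axis=0)
    _, _, Vt = np.linalg.svd(Fc, full_matrices=False)
    return Fc @ Vt[:r].T

def regression_impute(R, M):
    """Stand-in imputer: least-squares prediction of each incomplete column from R."""
    M_filled = M.copy()
    design = np.column_stack([np.ones(len(R)), R])
    for j in range(M.shape[1]):
        missing = np.isnan(M[:, j])
        if not missing.any():
            continue
        coef, *_ = np.linalg.lstsq(design[~missing], M[~missing, j], rcond=None)
        M_filled[missing, j] = design[missing] @ coef
    return M_filled

def pca_then_impute(X, r):
    """Reduce the fully observed columns with PCA, then impute the remaining columns."""
    has_missing = np.isnan(X).any(axis=0)
    R = pca_reduce(X[:, ~has_missing], r)
    X_filled = X.copy()
    X_filled[:, has_missing] = regression_impute(R, X[:, has_missing])
    return X_filled

# Toy low-rank data with missing entries in the last two columns only.
rng = np.random.default_rng(0)
X_true = rng.normal(size=(300, 3)) @ rng.normal(size=(3, 20))
X_true += 0.01 * rng.normal(size=X_true.shape)
X_missing = X_true.copy()
mask = np.zeros(X_true.shape, dtype=bool)
mask[:, -2:] = rng.random((300, 2)) < 0.2
X_missing[mask] = np.nan

X_filled = pca_then_impute(X_missing, r=3)
mse = np.mean((X_filled[mask] - X_true[mask]) ** 2)
print(f"imputation MSE: {mse:.5f}")
```

The imputer only ever sees `r` reduced columns plus the incomplete ones instead of the full feature set, which is where a speed and memory saving of the kind the abstract describes would come from.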

artificial intelligence, data quality, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2205.1515

Country: Europe (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.82)


New applications of Principal Component Analysis (PCA) part 2 (Machine Learning)

#artificialintelligence | Mar-13-2023, 06:55:19 GMT

Abstract: The Laser Interferometer Space Antenna (LISA) will provide us with a unique opportunity to observe the early inspiral phase of supermassive binary black holes (SMBBHs) in the mass range of 10^5 to 10^6 solar masses, which lasts for several years. It will also detect the merger and ringdown phases of these sources. Therefore, such sources are extremely useful for multiparameter tests of general relativity (GR), where parametrized deviations from GR at multiple post-Newtonian orders are simultaneously measured, thus allowing for a rigorous test of GR. However, the correlations of the deviation parameters with the intrinsic parameters of the system make multiparameter tests extremely challenging to perform. We demonstrate the use of principal component analysis (PCA) to obtain a new set of deviation parameters, which are best-measured orthogonal linear combinations of the original deviation parameters.
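The idea of replacing correlated parameters with best-measured orthogonal combinations can be sketched with an eigendecomposition of a parameter covariance matrix. The covariance values and the function name `best_measured_combinations` are hypothetical, and how the covariance is obtained for real sources is not shown.

```python
import numpy as np

def best_measured_combinations(cov):
    """Eigen-decompose a parameter covariance matrix.

    Returns variances in ascending order and the matching orthonormal
    directions as columns; the first column is the best-measured combination.
    """
    variances, directions = np.linalg.eigh(cov)
    return variances, directions

# Toy posterior samples for three strongly correlated deviation parameters.
rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.9, 0.7],
                     [0.9, 1.0, 0.8],
                     [0.7, 0.8, 1.0]])
samples = rng.multivariate_normal(np.zeros(3), true_cov, size=5000)

cov = np.cov(samples, rowvar=False)
variances, directions = best_measured_combinations(cov)

# Re-express each sample in the new basis: the new parameters are uncorrelated.
new_params = samples @ directions
print(np.round(np.cov(new_params, rowvar=False), 3))
print("best-measured variance:", variances[0], "vs smallest original:", cov.diagonal().min())
```

The smallest eigenvalue is never larger than the smallest original variance, so the leading combination is constrained at least as tightly as any single original parameter.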

artificial intelligence, machine learning, survey article, (7 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.68)
