AITopics | Principal Component Analysis

Collaborating Authors

Principal Component Analysis

News Overviews Instructional Materials AI-Alerts Classics

Kernel tricks and nonlinear dimensionality reduction via RBF kernel PCA

#artificialintelligenceAug-22-2016, 03:10:28 GMT

Most machine learning algorithms have been developed and statistically validated for linearly separable data. Popular examples are linear classifiers like Support Vector Machines (SVMs) or the (standard) Principal Component Analysis (PCA) for dimensionality reduction. However, most real world data requires nonlinear methods in order to perform tasks that involve the analysis and discovery of patterns successfully. The focus of this article is to briefly introduce the idea of kernel methods and to implement a Gaussian radius basis function (RBF) kernel that is used to perform nonlinear dimensionality reduction via BF kernel principal component analysis (kPCA). The main purpose of principal component analysis (PCA) is the analysis of data to identify patterns that represent the data "well." The principal components can be understood as new axes of the dataset that maximize the variance along those axes (the eigenvectors of the covariance matrix).

artificial intelligence, machine learning, mathbf, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.56)

Add feedback

Principal Component Analysis

#artificialintelligenceAug-22-2016, 03:10:27 GMT

Principal Component Analysis (PCA) is a simple yet popular and useful linear transformation technique that is used in numerous applications, such as stock market predictions, the analysis of gene expression data, and many more. In this tutorial, we will see that PCA is not just a "black box", and we are going to unravel its internals in 3 basic steps. The sheer size of data in the modern age is not only a challenge for computer hardware but also a main bottleneck for the performance of many machine learning algorithms. The main goal of a PCA analysis is to identify patterns in data; PCA aims to detect the correlation between variables. If a strong correlation between variables exists, the attempt to reduce the dimensionality only makes sense.

artificial intelligence, machine learning, matrix, (15 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

Dynamic Principal Component Analysis: Identifying the Relationship between Multiple Air Pollutants

Melnikov, Oleg, Raun, Loren H., Ensor, Katherine B.

arXiv.org Machine LearningAug-9-2016

The dynamic nature of air quality chemistry and transport makes it difficult to identify the mixture of air pollutants for a region. In this study of air quality in the Houston metropolitan area we apply dynamic principal component analysis (DPCA) to a normalized multivariate time series of daily concentration measurements of five pollutants (O3, CO, NO2, SO2, PM2.5) from January 1, 2009 through December 31, 2011 for each of the 24 hours in a day. The resulting dynamic components are examined by hour across days for the 3 year period. Diurnal and seasonal patterns are revealed underlining times when DPCA performs best and two principal components (PCs) explain most variability in the multivariate series. DPCA is shown to be superior to static principal component analysis (PCA) in discovery of linear relations among transformed pollutant measurements. DPCA captures the time-dependent correlation structure of the underlying pollutants recorded at up to 34 monitoring sites in the region. In winter mornings the first principal component (PC1) (mainly CO and NO2) explains up to 70% of variability. Augmenting with the second principal component (PC2) (mainly driven by SO2) the explained variability rises to 90%. In the afternoon, O3 gains prominence in the second principal component. The seasonal profile of PCs' contribution to variance loses its distinction in the afternoon, yet cumulatively PC1 and PC2 still explain up to 65% of variability in ambient air data. DPCA provides a strategy for identifying the changing air quality profile for the region studied.

artificial intelligence, machine learning, principal component analysis, (17 more...)

arXiv.org Machine Learning

1608.03022

Country:

Europe (0.67)
North America > United States > Texas > Harris County > Houston (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (0.67)
Health & Medicine > Therapeutic Area (0.46)
Materials > Chemicals (0.46)
Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.81)

Add feedback

Asymptotic properties of Principal Component Analysis and shrinkage-bias adjustment under the Generalized Spiked Population model

Dey, Rounak, Lee, Seunggeun

arXiv.org Machine LearningJul-28-2016

With the development of high-throughput technologies, principal component analysis (PCA) in the high-dimensional regime is of great interest. Most of the existing theoretical and methodological results for high-dimensional PCA are based on the spiked population model in which all the population eigenvalues are equal except for a few large ones. Due to the presence of local correlation among features, however, this assumption may not be satisfied in many real-world datasets. To address this issue, we investigated the asymptotic behaviors of PCA under the generalized spiked population model. Based on the theoretical results, we proposed a series of methods for the consistent estimation of population eigenvalues, angles between the sample and population eigenvectors, correlation coefficients between the sample and population principal component (PC) scores, and the shrinkage bias adjustment for the predicted PC scores. Using numerical experiments and real data examples from the genetics literature, we showed that our methods can greatly reduce bias and improve prediction accuracy.

artificial intelligence, eigenvalue, machine learning, (16 more...)

arXiv.org Machine Learning

1607.08647

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.95)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

A Novel Approach for Phase Identification in Smart Grids Using Graph Theory and Principal Component Analysis

Jayadev, P Satya, Rajeswaran, Aravind, Bhatt, Nirav P, Pasumarthy, Ramkrishna

arXiv.org Machine LearningJun-7-2016

Consumers with low demand, like households, are generally supplied single-phase power by connecting their service mains to one of the phases of a distribution transformer. The distribution companies face the problem of keeping a record of consumer connectivity to a phase due to uninformed changes that happen. The exact phase connectivity information is important for the efficient operation and control of distribution system. We propose a new data driven approach to the problem based on Principal Component Analysis (PCA) and its Graph Theoretic interpretations, using energy measurements in equally timed short intervals, generated from smart meters. We propose an algorithm for inferring phase connectivity from noisy measurements. The algorithm is demonstrated using simulated data for phase connectivities in distribution networks.

artificial intelligence, machine learning, survey article, (18 more...)

arXiv.org Machine Learning

1511.06063

Country:

Europe (0.46)
North America > United States (0.28)

Genre:

Research Report > Promising Solution (0.40)
Overview > Innovation (0.40)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

Suppressing Background Radiation Using Poisson Principal Component Analysis

Tandon, P., Huggins, P., Dubrawski, A., Labov, S., Nelson, K.

arXiv.org Machine LearningMay-26-2016

Performance of nuclear threat detection systems based on gamma-ray spectrometry often strongly depends on the ability to identify the part of measured signal that can be attributed to background radiation. We have successfully applied a method based on Principal Component Analysis (PCA) to obtain a compact null-space model of background spectra using PCA projection residuals to derive a source detection score. We have shown the method's utility in a threat detection system using mobile spectrometers in urban scenes (Tandon et al 2012). While it is commonly assumed that measured photon counts follow a Poisson process, standard PCA makes a Gaussian assumption about the data distribution, which may be a poor approximation when photon counts are low. This paper studies whether and in what conditions PCA with a Poisson-based loss function (Poisson PCA) can outperform standard Gaussian PCA in modeling background radiation to enable more sensitive and specific nuclear threat detection.

artificial intelligence, machine learning, principal component analysis, (13 more...)

arXiv.org Machine Learning

1605.08455

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.49)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (0.72)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.63)

Add feedback

Wishart Mechanism for Differentially Private Principal Components Analysis

Jiang, Wuxuan (Shanghai Jiao Tong University) | Xie, Cong (Shanghai Jiao Tong University) | Zhang, Zhihua (Shanghai Jiao Tong University)

AAAI ConferencesApr-19-2016

We propose a new input perturbation mechanism for publishing a covariance matrix to achieve (epsilon,0)-differential privacy. Our mechanism uses a Wishart distribution to generate matrix noise. In particular, we apply this mechanism to principal component analysis (PCA). Our mechanism is able to keep the positive semi-definiteness of the published covariance matrix. Thus, our approach gives rise to a general publishing framework for input perturbation of a symmetric positive semidefinite matrix. Moreover, compared with the classic Laplace mechanism, our method has better utility guarantee. To the best of our knowledge, the Wishart mechanism is the best input perturbation approach for (epsilon,0)-differentially private PCA. We also compare our work with previous exponential mechanism algorithms in the literature and provide near optimal bound while having more flexibility and less computational intractability.

artificial intelligence, machine learning, matrix, (19 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Genre: Research Report (0.47)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

Principal Component Analysis using R

#artificialintelligenceMar-30-2016, 06:15:29 GMT

Technically speaking, PCA uses orthogonal projection of highly correlated variables to a set of values of linearly uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. This linear transformation is defined in such a way that the first principal component has the largest possible variance. It accounts for as much of the variability in the data as possible by considering highly correlated features. Each succeeding component in turn has the highest variance using the features that are less correlated with the first principal component and that are orthogonal to the preceding component.

artificial intelligence, machine learning, principal component analysis, (1 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.40)

Add feedback

Phase transitions and sample complexity in Bayes-optimal matrix factorization

Kabashima, Yoshiyuki, Krzakala, Florent, Mézard, Marc, Sakata, Ayaka, Zdeborová, Lenka

arXiv.org Machine LearningMar-21-2016

We analyse the matrix factorization problem. Given a noisy measurement of a product of two matrices, the problem is to estimate back the original matrices. It arises in many applications such as dictionary learning, blind matrix calibration, sparse principal component analysis, blind source separation, low rank matrix completion, robust principal component analysis or factor analysis. It is also important in machine learning: unsupervised representation learning can often be studied through matrix factorization. We use the tools of statistical mechanics - the cavity and replica methods - to analyze the achievability and computational tractability of the inference problems in the setting of Bayes-optimal inference, which amounts to assuming that the two matrices have random independent elements generated from some known distribution, and this information is available to the inference algorithm. In this setting, we compute the minimal mean-squared-error achievable in principle in any computational time, and the error that can be achieved by an efficient approximate message passing algorithm. The computation is based on the asymptotic state-evolution analysis of the algorithm. The performance that our analysis predicts, both in terms of the achieved mean-squared-error, and in terms of sample complexity, is extremely promising and motivating for a further development of the algorithm.

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Machine Learning

doi: 10.1109/TIT.2016.2556702

1402.1298

Country:

Asia (0.92)
Europe (0.67)
North America > United States > California (0.45)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.44)

Add feedback

Principal Component Projection Without Principal Component Analysis

Frostig, Roy, Musco, Cameron, Musco, Christopher, Sidford, Aaron

arXiv.org Machine LearningFeb-22-2016

We show how to efficiently project a vector onto the top principal components of a matrix, without explicitly computing these components. Specifically, we introduce an iterative algorithm that provably computes the projection using few calls to any black-box routine for ridge regression. By avoiding explicit principal component analysis (PCA), our algorithm is the first with no runtime dependence on the number of top principal components. We show that it can be used to give a fast iterative method for the popular principal component regression problem, giving the first major runtime improvement over the naive method of combining PCA with regression. To achieve our results, we first observe that ridge regression can be used to obtain a "smooth projection" onto the top principal components. We then sharpen this approximation to true projection using a low-degree polynomial approximation to the matrix step function. Step function approximation is a topic of long-term interest in scientific computing. We extend prior theory by constructing polynomials with simple iterative structure and rigorously analyzing their behavior under limited precision.

artificial intelligence, machine learning, regression, (18 more...)

arXiv.org Machine Learning

1602.06872

Country: North America > United States (0.28)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback