AITopics | Principal Component Analysis

Collaborating Authors

Principal Component Analysis

News Overviews Instructional Materials AI-Alerts Classics

Principal Component Analysis

#artificialintelligenceJan-8-2018, 21:18:24 GMT

In this post, we will learn about Principal Component Analysis (PCA) -- a popular dimensionality reduction technique in Machine Learning. Our goal is to form an intuitive understanding of PCA without going into all the mathematical details.

artificial intelligence, machine learning, principal component, (17 more...)

#artificialintelligence

Country: North America > United States (0.16)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

Polar $n$-Complex and $n$-Bicomplex Singular Value Decomposition and Principal Component Pursuit

Chan, Tak-Shing T., Yang, Yi-Hsuan

arXiv.org Machine LearningJan-8-2018

Informed by recent work on tensor singular value decomposition and circulant algebra matrices, this paper presents a new theoretical bridge that unifies the hypercomplex and tensor-based approaches to singular value decomposition and robust principal component analysis. We begin our work by extending the principal component pursuit to Olariu's polar $n$-complex numbers as well as their bicomplex counterparts. In so doing, we have derived the polar $n$-complex and $n$-bicomplex proximity operators for both the $\ell_1$- and trace-norm regularizers, which can be used by proximal optimization methods such as the alternating direction method of multipliers. Experimental results on two sets of audio data show that our algebraically-informed formulation outperforms tensor robust principal component analysis. We conclude with the message that an informed definition of the trace norm can bridge the gap between the hypercomplex and tensor-based approaches. Our approach can be seen as a general methodology for generating other principal component pursuit algorithms with proper algebraic structures.

artificial intelligence, machine learning, matrix, (17 more...)

arXiv.org Machine Learning

doi: 10.1109/TSP.2016.2612171

1801.03773

Country: Asia > Taiwan (0.14)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment (0.93)
Media > Music (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.44)

Add feedback

Diffusion Approximations for Online Principal Component Estimation and Global Convergence

Li, Chris Junchi, Wang, Mengdi, Liu, Han, Zhang, Tong

Neural Information Processing SystemsDec-31-2017

In this paper, we propose to adopt the diffusion approximation tools to study the dynamics of Oja's iteration which is an online stochastic gradient descent method for the principal component analysis. Oja's iteration maintains a running estimate of the true principal component from streaming data and enjoys less temporal and spatial complexities. We show that the Oja's iteration for the top eigenvector generates a continuous-state discrete-time Markov chain over the unit sphere. We characterize the Oja's iteration in three phases using diffusion approximation and weak convergence tools. Our three-phase analysis further provides a finite-sample error bound for the running estimate, which matches the minimax information lower bound for principal component analysis under the additional assumption of bounded samples.

artificial intelligence, iteration, machine learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Asia > China > Guangdong Province (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)

Add feedback

Understanding Dimension Reduction with Principal Component Analysis (PCA)

#artificialintelligenceDec-15-2017, 13:01:36 GMT

Big Data Analytics is a buzzword nowadays. Everyone is talking about it. Big data Analytics has found application in many sectors like medicine, politics, dating. Though big data analytics is used in bettering many aspects of human life, it comes with its own problems. One of them is'Curse of dimensionality'.

artificial intelligence, big data, eigenvector, (17 more...)

#artificialintelligence

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.41)

Add feedback

Neural Component Analysis for Fault Detection

Zhao, Haitao

arXiv.org Machine LearningDec-11-2017

Principal component analysis (PCA) is largely adopted for chemical process monitoring and numerous PCA-based systems have been developed to solve various fault detection and diagnosis problems. Since PCA-based methods assume that the monitored process is linear, nonlinear PCA models, such as autoencoder models and kernel principal component analysis (KPCA), has been proposed and applied to nonlinear process monitoring. However, KPCA-based methods need to perform eigen-decomposition (ED) on the kernel Gram matrix whose dimensions depend on the number of training data. Moreover, prefixed kernel parameters cannot be most effective for different faults which may need different parameters to maximize their respective detection performances. Autoencoder models lack the consideration of orthogonal constraints which is crucial for PCA-based algorithms. To address these problems, this paper proposes a novel nonlinear method, called neural component analysis (NCA), which intends to train a feedforward neural work with orthogonal constraints such as those used in PCA. NCA can adaptively learn its parameters through backpropagation and the dimensionality of the nonlinear features has no relationship with the number of training samples. Extensive experimental results on the Tennessee Eastman (TE) benchmark process show the superiority of NCA in terms of missed detection rate (MDR) and false alarm rate (FAR). The source code of NCA can be found in https://github.com/haitaozhao/Neural-Component-Analysis.git.

nca, neural network, survey article, (18 more...)

arXiv.org Machine Learning

1712.04118

Country: North America > United States > Tennessee (0.25)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.45)

Add feedback

Contrastive Principal Component Analysis

Abid, Abubakar, Zhang, Martin J., Bagaria, Vivek K., Zou, James

arXiv.org Machine LearningNov-21-2017

We present a new technique called contrastive principal component analysis (cPCA) that is designed to discover low-dimensional structure that is unique to a dataset, or enriched in one dataset relative to other data. The technique is a generalization of standard PCA, for the setting where multiple datasets are available -- e.g. a treatment and a control group, or a mixed versus a homogeneous population -- and the goal is to explore patterns that are specific to one of the datasets. We conduct a wide variety of experiments in which cPCA identifies important dataset-specific patterns that are missed by PCA, demonstrating that it is useful for many applications: subgroup discovery, visualizing trends, feature selection, denoising, and data-dependent standardization. We provide geometrical interpretations of cPCA and show that it satisfies desirable theoretical guarantees. We also extend cPCA to nonlinear settings in the form of kernel cPCA. We have released our code as a python package and documentation is on Github.

artificial intelligence, dataset, health & medicine, (14 more...)

arXiv.org Machine Learning

1709.06716

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.86)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.95)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.62)

Add feedback

Lazy stochastic principal component analysis

Wojnowicz, Michael, Nguyen, Dinh, Li, Li, Zhao, Xuan

arXiv.org Machine LearningSep-21-2017

Stochastic principal component analysis (SPCA) has become a popular dimensionality reduction strategy for large, high-dimensional datasets. We derive a simplified algorithm, called Lazy SPCA, which has reduced computational complexity and is better suited for large-scale distributed computation. We prove that SPCA and Lazy SPCA find the same approximations to the principal subspace, and that the pairwise distances between samples in the lower-dimensional space is invariant to whether SPCA is executed lazily or not. Empirical studies find downstream predictive performance to be identical for both methods, and superior to random projections, across a range of predictive models (linear regression, logistic lasso, and random forests). In our largest experiment with 4.6 million samples, Lazy SPCA reduced 43.7 hours of computation to 9.9 hours. Overall, Lazy SPCA relies exclusively on matrix multiplications, besides an operation on a small square matrix whose size depends only on the target dimensionality.

artificial intelligence, spca, survey article, (18 more...)

arXiv.org Machine Learning

1709.07175

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.62)

Add feedback

Bayesian nonparametric Principal Component Analysis

Elvira, Clément, Chainais, Pierre, Dobigeon, Nicolas

arXiv.org Machine LearningSep-17-2017

Principal component analysis (PCA) is very popular to perform dimension reduction. The selection of the number of significant components is essential but often based on some practical heuristics depending on the application. Only few works have proposed a probabilistic approach able to infer the number of significant components. To this purpose, this paper introduces a Bayesian nonparametric principal component analysis (BNP-PCA). The proposed model projects observations onto a random orthogonal basis which is assigned a prior distribution defined on the Stiefel manifold. The prior on factor scores involves an Indian buffet process to model the uncertainty related to the number of components. The parameters of interest as well as the nuisance parameters are finally inferred within a fully Bayesian framework via Monte Carlo sampling. A study of the (in-)consistence of the marginal maximum a posteriori estimator of the latent dimension is carried out. A new estimator of the subspace dimension is proposed. Moreover, for sake of statistical significance, a Kolmogorov-Smirnov test based on the posterior distribution of the principal components is used to refine this estimate. The behaviour of the algorithm is first studied on various synthetic examples. Finally, the proposed BNP dimension reduction approach is shown to be easily yet efficiently coupled with clustering or latent factor models within a unique framework.

bayesian inference, posterior distribution, upstream oil & gas, (19 more...)

arXiv.org Machine Learning

1709.05667

Country: North America > United States > New York (0.14)

Genre: Research Report (0.83)

Industry: Energy > Oil & Gas > Upstream (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.82)

Add feedback

Informed Non-convex Robust Principal Component Analysis with Features

Xue, Niannan, Deng, Jiankang, Panagakis, Yannis, Zafeiriou, Stefanos

arXiv.org Machine LearningSep-14-2017

We revisit the problem of robust principal component analysis with features acting as prior side information. To this aim, a novel, elegant, non-convex optimization approach is proposed to decompose a given observation matrix into a low-rank core and the corresponding sparse residual. Rigorous theoretical analysis of the proposed algorithm results in exact recovery guarantees with low computational complexity. Aptly designed synthetic experiments demonstrate that our method is the first to wholly harness the power of non-convexity over convexity in terms of both recoverability and speed. That is, the proposed non-convex approach is more accurate and faster compared to the best available algorithms for the problem under study. Two real-world applications, namely image classification and face denoising further exemplify the practical superiority of the proposed method.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1709.04836

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Add feedback

Manifold Learning Using Kernel Density Estimation and Local Principal Components Analysis

Mohammed, Kitty, Narayanan, Hariharan

arXiv.org Machine LearningSep-11-2017

We consider the problem of recovering a $d-$dimensional manifold $\mathcal{M} \subset \mathbb{R}^n$ when provided with noiseless samples from $\mathcal{M}$. There are many algorithms (e.g., Isomap) that are used in practice to fit manifolds and thus reduce the dimensionality of a given data set. Ideally, the estimate $\mathcal{M}_\mathrm{put}$ of $\mathcal{M}$ should be an actual manifold of a certain smoothness; furthermore, $\mathcal{M}_\mathrm{put}$ should be arbitrarily close to $\mathcal{M}$ in Hausdorff distance given a large enough sample. Generally speaking, existing manifold learning algorithms do not meet these criteria. Fefferman, Mitter, and Narayanan (2016) have developed an algorithm whose output is provably a manifold. The key idea is to define an approximate squared-distance function (asdf) to $\mathcal{M}$. Then, $\mathcal{M}_\mathrm{put}$ is given by the set of points where the gradient of the asdf is orthogonal to the subspace spanned by the largest $n - d$ eigenvectors of the Hessian of the asdf. As long as the asdf meets certain regularity conditions, $\mathcal{M}_\mathrm{put}$ is a manifold that is arbitrarily close in Hausdorff distance to $\mathcal{M}$. In this paper, we define two asdfs that can be calculated from the data and show that they meet the required regularity conditions. The first asdf is based on kernel density estimation, and the second is based on estimation of tangent spaces using local principal components analysis.

artificial intelligence, machine learning, manifold, (16 more...)

arXiv.org Machine Learning

1709.03615

Country:

North America > United States > Washington > King County > Seattle (0.14)
Asia > India > Maharashtra (0.14)

Genre: Research Report (0.63)

Industry: Education (0.62)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.60)

Add feedback