 Principal Component Analysis


Policy Search with High-Dimensional Context Variables

AAAI Conferences

Direct contextual policy search methods learn to improve policy parameters and simultaneously generalize these parameters to different context or task variables. However, learning from high-dimensional context variables, such as camera images, is still a prominent problem in many real-world tasks. A naive application of unsupervised dimensionality reduction methods to the context variables, such as principal component analysis, is insufficient as task-relevant input may be ignored. In this paper, we propose a contextual policy search method in the model-based relative entropy stochastic search framework with integrated dimensionality reduction. We learn a model of the reward that is locally quadratic in both the policy parameters and the context variables. Furthermore, we perform supervised linear dimensionality reduction on the context variables by nuclear norm regularization. The experimental results show that the proposed method outperforms naive dimensionality reduction via principal component analysis and a state-of-the-art contextual policy search method.


Can You Use Principal Component Analysis with a Training Set Test Set Model?

#artificialintelligence

I recently gave a free webinar on Principal Component Analysis. We had almost 300 researchers attend and didn't get through all the questions. This is part of a series of answers to those questions. If you missed it, you can get the webinar recording here. Principal Component Analysis specifically could be used with a training and test data set, but it doesn't make as much sense as doing so for Factor Analysis.
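As a rough illustration of what using PCA with a training and test set could look like, the sketch below estimates the mean and the component directions on the training rows only and then applies that same fixed transformation to the held-out rows. The helper names `fit_pca` and `apply_pca`, the synthetic data, and the 80/20 split are assumptions made for this example; they do not come from the webinar.

```python
import numpy as np

def fit_pca(X_train, n_components):
    """Estimate the PCA mean and component directions from training data only."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:n_components]

def apply_pca(X, mean, components):
    """Project data using a mean and components estimated elsewhere."""
    return (X - mean) @ components.T

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X_train, X_test = X[:80], X[80:]

mean, components = fit_pca(X_train, n_components=2)
Z_train = apply_pca(X_train, mean, components)
Z_test = apply_pca(X_test, mean, components)  # test rows never influence the fit
```

The point of the split is that the test scores are computed with the training mean and training directions, so nothing about the held-out rows leaks into the fitted components.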


Applications of electronic noses and tongues in food analysis

AITopics Original Links

This review examines the applications of electronic noses and tongues in food analysis. A brief history of the development of sensors is included and this is illustrated by descriptions of the different types of sensors utilized in these devices. As pattern recognition techniques are widely used to analyse the data obtained from these multisensor arrays, a discussion of principal components analysis and artificial neural networks is essential. An introduction to the integration of electronic tongues and noses is also incorporated and the strengths and weaknesses of both are described. Applications described include identification and classification of flavour and aroma and other measurements of quality using the electronic nose.


Towards multiple kernel principal component analysis for integrative analysis of tumor samples

arXiv.org Machine Learning

Personalized treatment of patients based on tissue-specific cancer subtypes has strongly increased the efficacy of the chosen therapies. Even though the amount of data measured for cancer patients has increased in recent years, most cancer subtypes are still diagnosed based on individual data sources (e.g. gene expression data). We propose an unsupervised data integration method based on kernel principal component analysis. Principal component analysis is one of the most widely used techniques in data analysis. Unfortunately, the straightforward multiple-kernel extension of this method leads to the use of only one of the input matrices, which does not fit the goal of gaining information from all data sources. Therefore, we present a scoring function to determine the impact of each input matrix. The approach enables visualizing the integrated data and subsequent clustering for cancer subtype identification. Due to the nature of the method, no free parameters have to be set. We apply the methodology to five different cancer data sets and demonstrate its advantages in terms of results and usability.


Correlated-PCA: Principal Components' Analysis when Data and Noise are Correlated

Neural Information Processing Systems

Given a matrix of observed data, Principal Components Analysis (PCA) computes a small number of orthogonal directions that contain most of its variability. Provably accurate solutions for PCA have been in use for a long time. However, to the best of our knowledge, all existing theoretical guarantees for it assume that the data and the corrupting noise are mutually independent, or at least uncorrelated. This is valid in practice often, but not always. In this paper, we study the PCA problem in the setting where the data and noise can be correlated. Such noise is often also referred to as "data-dependent noise". We obtain a correctness result for the standard eigenvalue decomposition (EVD) based solution to PCA under simple assumptions on the data-noise correlation. We also develop and analyze a generalization of EVD, cluster-EVD, that improves upon EVD in certain regimes.
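For readers unfamiliar with the standard EVD-based solution the abstract refers to, a minimal sketch follows: form the sample covariance matrix, take its eigendecomposition, and keep the eigenvectors with the largest eigenvalues. The function name `pca_evd` and the synthetic low-rank-plus-noise data are assumptions for illustration only. The noise here is independent of the data, so this shows plain EVD, not the correlated-noise setting or the cluster-EVD generalization studied in the paper.

```python
import numpy as np

def pca_evd(X, k):
    """Top-k principal directions via eigendecomposition of the sample covariance."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    # eigh is for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:k]
    return eigvals[order], eigvecs[:, order]

# Synthetic data (assumed): a rank-2 signal in 6 dimensions plus small independent noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=(300, 2)) @ rng.normal(size=(2, 6))
X = signal + 0.01 * rng.normal(size=(300, 6))

eigvals, directions = pca_evd(X, k=2)
# The variance of the data along each direction equals the matching eigenvalue.
projected_var = np.var((X - X.mean(axis=0)) @ directions, axis=0, ddof=1)
```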


Policy Search with High-Dimensional Context Variables

arXiv.org Machine Learning

Direct contextual policy search methods learn to improve policy parameters and simultaneously generalize these parameters to different context or task variables. However, learning from high-dimensional context variables, such as camera images, is still a prominent problem in many real-world tasks. A naive application of unsupervised dimensionality reduction methods to the context variables, such as principal component analysis, is insufficient as task-relevant input may be ignored. In this paper, we propose a contextual policy search method in the model-based relative entropy stochastic search framework with integrated dimensionality reduction. We learn a model of the reward that is locally quadratic in both the policy parameters and the context variables. Furthermore, we perform supervised linear dimensionality reduction on the context variables by nuclear norm regularization. The experimental results show that the proposed method outperforms naive dimensionality reduction via principal component analysis and a state-of-the-art contextual policy search method.


Correlated-PCA: Principal Components' Analysis when Data and Noise are Correlated

arXiv.org Machine Learning

Given a matrix of observed data, Principal Components Analysis (PCA) computes a small number of orthogonal directions that contain most of its variability. Provably accurate solutions for PCA have been in use for a long time. However, to the best of our knowledge, all existing theoretical guarantees for it assume that the data and the corrupting noise are mutually independent, or at least uncorrelated. This is valid in practice often, but not always. In this paper, we study the PCA problem in the setting where the data and noise can be correlated. Such noise is often also referred to as "data-dependent noise". We obtain a correctness result for the standard eigenvalue decomposition (EVD) based solution to PCA under simple assumptions on the data-noise correlation. We also develop and analyze a generalization of EVD, cluster-EVD, that improves upon EVD in certain regimes.


Efficient L1-Norm Principal-Component Analysis via Bit Flipping

arXiv.org Machine Learning

It was shown recently that the $K$ L1-norm principal components (L1-PCs) of a real-valued data matrix $\mathbf X \in \mathbb R^{D \times N}$ ($N$ data samples of $D$ dimensions) can be exactly calculated with cost $\mathcal{O}(2^{NK})$ or, when advantageous, $\mathcal{O}(N^{dK - K + 1})$ where $d=\mathrm{rank}(\mathbf X)$, $K


Iteratively Reweighted Least Squares Algorithms for L1-Norm Principal Component Analysis

arXiv.org Machine Learning

Principal component analysis (PCA) is a technique to find orthonormal vectors, each a linear combination of the attributes of the data, that explain the variance structure of the data [12]. Since a few orthonormal vectors usually explain most of the variance, PCA is often used to reduce the dimension of the data by keeping only a few of the orthonormal vectors. These orthonormal vectors are called principal components (PCs). For dimensionality reduction, we are given a target dimension p, the number of PCs. To measure accuracy, given p principal components, the original data is first projected into the lower dimension using the PCs. Next, the projected data in the lower dimension is lifted back to the original dimension using the PCs. Observe that this procedure loses some information if p is smaller than the dimension of the original attribute space. The reconstruction error is defined as the difference between the projected-and-lifted data and the original data. To select the best p PCs, the following two objective functions are usually used: [P1] minimization of the reconstruction error, [P2] maximization of the variance of the projected data.
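The project-then-lift procedure and the two objectives can be sketched numerically. The example below assumes the usual squared Euclidean (L2) norm, centered data, and synthetic inputs; all variable names are illustrative. Under those assumptions the squared reconstruction error and the squared norm of the projected data always sum to the total squared norm of the centered data, so minimizing [P1] and maximizing [P2] select the same PCs. That identity is specific to the L2 norm and need not carry over to L1-norm formulations.

```python
import numpy as np

# Synthetic correlated data (assumed for illustration).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))
Xc = X - X.mean(axis=0)

p = 2  # target dimension: the number of PCs to keep
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
V = Vt[:p].T                     # columns are the p orthonormal PCs

Z = Xc @ V                       # project into the lower dimension
X_hat = Z @ V.T                  # lift back to the original dimension

recon_error = np.sum((Xc - X_hat) ** 2)   # quantity minimized in [P1]
projected_var = np.sum(Z ** 2)            # quantity maximized in [P2] (unnormalized)
total = np.sum(Xc ** 2)
```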


Kernel tricks and nonlinear dimensionality reduction via RBF kernel PCA

#artificialintelligence

Most machine learning algorithms have been developed and statistically validated for linearly separable data. Popular examples are linear classifiers like Support Vector Machines (SVMs) or the (standard) Principal Component Analysis (PCA) for dimensionality reduction. However, most real-world data requires nonlinear methods in order to successfully perform tasks that involve the analysis and discovery of patterns. The focus of this article is to briefly introduce the idea of kernel methods and to implement a Gaussian radial basis function (RBF) kernel that is used to perform nonlinear dimensionality reduction via RBF kernel principal component analysis (kPCA). The main purpose of principal component analysis (PCA) is the analysis of data to identify patterns that represent the data "well." The principal components can be understood as new axes of the dataset that maximize the variance along those axes (the eigenvectors of the covariance matrix).
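A minimal sketch of RBF kernel PCA along these lines is shown below. It is not the article's own implementation: the function name `rbf_kernel_pca`, the choice `gamma=0.5`, and the concentric-circles data are assumptions for illustration. The steps are the standard ones: build the Gaussian kernel matrix from pairwise squared distances, center it in feature space, and take the leading eigenvectors of the centered matrix.

```python
import numpy as np

def rbf_kernel_pca(X, gamma, n_components):
    """Project the training samples onto the leading RBF kernel principal components."""
    # Pairwise squared Euclidean distances, clipped to guard against tiny negatives.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-gamma * sq_dists)

    # Center the kernel matrix, which centers the data in the implicit feature space.
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # eigh returns ascending eigenvalues; keep the largest n_components.
    eigvals, eigvecs = np.linalg.eigh(K_c)
    idx = np.argsort(eigvals)[::-1][:n_components]
    # Scaling unit eigenvectors by sqrt(eigenvalue) gives the projected coordinates.
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))

# Two concentric circles (assumed data): not linearly separable in the input space.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=100)
radii = np.repeat([1.0, 3.0], 50)
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

Z = rbf_kernel_pca(X, gamma=0.5, n_components=2)
```

The value of `gamma` controls the width of the Gaussian kernel and usually has to be tuned for the data at hand; the value used here is only a placeholder.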