Kernel Principal Component Analysis


Unveiling the Hidden Structure of Self-Attention via Kernel Principal Component Analysis

Neural Information Processing Systems

The remarkable success of transformers in sequence modeling tasks, spanning various applications in natural language processing and computer vision, is attributed to the critical role of self-attention. Similar to the development of most deep learning models, the construction of these attention mechanisms relies on heuristics and experience. In our work, we derive self-attention from kernel principal component analysis (kernel PCA) and show that self-attention projects its query vectors onto the principal component axes of its key matrix in a feature space. We then formulate the exact formula for the value matrix in self-attention, theoretically and empirically demonstrating that this value matrix captures the eigenvectors of the Gram matrix of the key vectors in self-attention. Leveraging our kernel PCA framework, we propose Attention with Robust Principal Components (RPC-Attention), a novel class of robust attention that is resilient to data contamination.
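To make the objects in this abstract concrete (queries, keys, values, and the Gram-like score matrix), here is a minimal numpy sketch of standard softmax self-attention; the shapes and random weights are purely illustrative, not the paper's construction:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Standard softmax self-attention: each output row mixes the value
    vectors, with weights derived from query-key similarities (QK^T,
    a Gram-like matrix)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])   # pairwise similarities
    A = softmax(scores, axis=-1)             # each row sums to 1
    return A @ V

rng = np.random.default_rng(0)
n, d = 5, 4
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 4)
```

The paper's claim is that this familiar computation can be re-derived as projecting the query vectors onto principal axes of the keys in a feature space.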


The Stability of Kernel Principal Components Analysis and its Relation to the Process Eigenspectrum

Neural Information Processing Systems

In this paper we analyze the relationships between the eigenvalues of the m x m Gram matrix K for a kernel k(·, ·) and the eigenspectrum of the underlying process. We bound the differences between the two spectra and provide a performance bound on kernel PCA.
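The central object of this analysis, the eigenspectrum of an m x m Gram matrix, can be computed directly. A small sketch with an RBF kernel (the kernel and data are illustrative choices, not the paper's setup):

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Gram matrix K with K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))           # m = 50 samples
K = rbf_kernel(X)
eigvals = np.linalg.eigvalsh(K)[::-1]  # spectrum, sorted descending
print(eigvals[:5])
```

The empirical spectrum computed this way is the quantity the paper relates to the eigenvalues of the underlying continuous process.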


On the Convergence of Eigenspaces in Kernel Principal Component Analysis

Neural Information Processing Systems

This paper presents a non-asymptotic statistical analysis of kernel PCA with a focus different from that of previous work on this topic. Here, instead of considering the reconstruction error of KPCA, we are interested in approximation error bounds for the eigenspaces themselves. We prove an upper bound that depends on the spacing between eigenvalues but not on the dimensionality of the eigenspace. As a consequence, this allows us to infer stability results for these estimated spaces.


Fault Detection Using Nonlinear Low-Dimensional Representation of Sensor Data

Shen, Kai, Mcguirk, Anya, Liao, Yuwei, Chaudhuri, Arin, Kakde, Deovrat

arXiv.org Machine Learning

Recent advances in enabling technologies such as sensing, computing, and communication are instrumental in achieving real-time equipment health monitoring. Real-time health monitoring enables a transition from traditional fixed-schedule preventive maintenance to predictive maintenance, where maintenance decisions are based on an objective assessment of equipment health. Falling sensor prices have enabled widespread adoption of sensor technology for health monitoring: the average cost of a sensor was $1.30 in 2004 and is expected to come down to $0.38 by 2020 [1]. Industries such as mining, transportation, and aerospace are among the leaders in adopting sensor-enabled predictive maintenance.


Python for Machine Learning and Data Mining

#artificialintelligence

Data mining and machine learning are hot topics in business intelligence strategy at many companies around the world. These fields give data scientists the opportunity to explore data in depth, finding valuable new information and building intelligent algorithms that can "learn" from the data and make optimal decisions for classification or forecasting tasks. This course takes a practical approach: I'll supply useful code snippets and teach you how to build professional desktop applications for machine learning and data mining with the Python language. We'll also work with real data from an example trading company and present our results professionally with well-illustrated charts. We'll start at the basic level, covering the main topics of the Python language along with the programs needed to develop our applications.


Towards multiple kernel principal component analysis for integrative analysis of tumor samples

Speicher, Nora K., Pfeifer, Nico

arXiv.org Machine Learning

Personalized treatment of patients based on tissue-specific cancer subtypes has strongly increased the efficacy of the chosen therapies. Even though the amount of data measured for cancer patients has increased in recent years, most cancer subtypes are still diagnosed from individual data sources (e.g. gene expression data). We propose an unsupervised data integration method based on kernel principal component analysis. Principal component analysis is one of the most widely used techniques in data analysis. Unfortunately, the straightforward multiple-kernel extension of this method leads to the use of only one of the input matrices, which does not fit the goal of gaining information from all data sources. Therefore, we present a scoring function to determine the impact of each input matrix. The approach enables visualization of the integrated data and subsequent clustering for cancer subtype identification. Due to the nature of the method, no free parameters have to be set. We apply the methodology to five different cancer data sets and demonstrate its advantages in terms of results and usability.
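A minimal sketch of the multiple-kernel idea: combine per-source Gram matrices and run PCA on the combination. The fixed convex weights below are a placeholder for illustration; the paper instead derives the weights from a scoring function:

```python
import numpy as np

def center(K):
    # Double-center the Gram matrix (centering in feature space)
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return J @ K @ J

def combined_kpca(kernels, weights, n_components=2):
    """PCA on a convex combination of Gram matrices from several
    data sources. Weight choice here is a fixed placeholder."""
    K = center(sum(w * Ki for w, Ki in zip(weights, kernels)))
    vals, vecs = np.linalg.eigh(K)
    idx = np.argsort(vals)[::-1][:n_components]
    # sample coordinates along the leading principal axes
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

rng = np.random.default_rng(0)
X1 = rng.normal(size=(30, 5))   # data source 1 (e.g. expression)
X2 = rng.normal(size=(30, 8))   # data source 2 (e.g. methylation)
K1, K2 = X1 @ X1.T, X2 @ X2.T   # linear kernels, one per source
Z = combined_kpca([K1, K2], weights=[0.5, 0.5])
print(Z.shape)  # (30, 2)
```

The resulting two-dimensional embedding `Z` is the kind of integrated representation that could then be clustered for subtype identification.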


Theory of matching pursuit

Hussain, Zakria, Shawe-Taylor, John S.

Neural Information Processing Systems

We analyse matching pursuit for kernel principal component analysis (KPCA) by proving that the sparse subspace it produces is a sample compression scheme. We show that the resulting compression bound is tighter than the KPCA bound of Shawe-Taylor et al. [7] and highly predictive of the size of the subspace needed to capture most of the variance in the data. We then analyse a second matching pursuit algorithm, kernel matching pursuit (KMP), which does not correspond to a sample compression scheme. However, we give a novel bound that views the choice of subspace of the KMP algorithm as a compression scheme and hence provide a VC bound to upper bound its future loss. Finally, we describe how the same bound can be applied to other matching-pursuit-related algorithms.
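As a rough illustration of greedy sparse subspace selection on a Gram matrix, here is pivoted incomplete Cholesky, a simplified stand-in for the matching-pursuit procedures analysed in the paper, not the authors' exact algorithm:

```python
import numpy as np

def greedy_basis(K, n_basis):
    """Greedy pivot selection on a PSD Gram matrix: repeatedly pick the
    point with the largest residual diagonal, i.e. the point least
    explained by the sparse subspace built so far."""
    n = K.shape[0]
    d = np.diag(K).astype(float).copy()   # residual diagonal
    L = np.zeros((n, n_basis))
    pivots = []
    for j in range(n_basis):
        i = int(np.argmax(d))             # greedy choice
        pivots.append(i)
        L[:, j] = (K[:, i] - L @ L[i, :]) / np.sqrt(d[i])
        d -= L[:, j] ** 2                 # deflate the residual
    return pivots, L

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
K = X @ X.T                   # rank-3 Gram matrix
pivots, L = greedy_basis(K, n_basis=3)
err = np.linalg.norm(K - L @ L.T)   # residual shrinks as the basis grows
print(pivots, err)
```

Because `K` here has rank 3, three greedy pivots already reconstruct it to numerical precision; on full-rank kernels the residual decays with the size of the selected subspace.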


Robust Kernel Principal Component Analysis

Nguyen, Minh H., De la Torre, Fernando

Neural Information Processing Systems

Kernel Principal Component Analysis (KPCA) is a popular generalization of linear PCA that allows non-linear feature extraction. In KPCA, data in the input space is mapped to higher (usually) dimensional feature space where the data can be linearly modeled. The feature space is typically induced implicitly by a kernel function, and linear PCA in the feature space is performed via the kernel trick. However, due to the implicitness of the feature space, some extensions of PCA such as robust PCA cannot be directly generalized to KPCA. This paper presents a technique to overcome this problem, and extends it to a unified framework for treating noise, missing data, and outliers in KPCA. Our method is based on a novel cost function to perform inference in KPCA. Extensive experiments, in both synthetic and real data, show that our algorithm outperforms existing methods.
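For reference, the kernel-trick computation shared by the KPCA papers above can be sketched in a few lines: center the Gram matrix, eigendecompose it, and read off each sample's coordinates along the leading axes (the kernel choice and data here are illustrative):

```python
import numpy as np

def kpca(K, n_components=2):
    """Kernel PCA via the kernel trick: linear PCA in the implicit
    feature space, using only the Gram matrix K."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                        # centering in feature space
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    # nonlinear feature coordinates for each sample
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

def rbf(X, gamma=0.5):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
Z = kpca(rbf(X))
print(Z.shape)  # (60, 2)
```

Note that everything operates on `K` alone; the feature space never appears explicitly, which is exactly why robust-PCA-style extensions that manipulate feature-space points do not carry over directly.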