Principal Component Analysis


Blockwise Principal Component Analysis for monotone missing data imputation and dimensionality reduction

arXiv.org Artificial Intelligence

Monotone missing data is a common problem in data analysis. However, imputation combined with dimensionality reduction can be computationally expensive, especially as datasets grow. To address this issue, we propose a Blockwise Principal Component Analysis Imputation (BPI) framework for dimensionality reduction and imputation of monotone missing data. The framework conducts Principal Component Analysis (PCA) on the observed part of each monotone block of the data and then applies a chosen imputation technique to the merged principal components. BPI can work with various imputation techniques and can significantly reduce imputation time compared to conducting dimensionality reduction after imputation, making it a practical and efficient approach for large datasets with monotone missing data. Our experiments validate the improvement in speed. They also show that while applying MICE imputation directly to the missing data may not converge, applying BPI with MICE may lead to convergence.
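
The framework's core loop is simple enough to sketch. Below is a minimal illustration, not the authors' BPI implementation: `block_cols` and the mean imputer stand in for the paper's monotone blocks and its chosen imputation technique, and the component count is arbitrary.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer

def blockwise_pca_impute(X, block_cols, n_components=2, imputer=None):
    """Toy blockwise-PCA imputation for monotone missing data.

    X          : (n, p) array whose NaNs form a monotone pattern
    block_cols : list of column-index arrays, one per monotone block
    """
    imputer = imputer or SimpleImputer(strategy="mean")
    scores = []
    for cols in block_cols:
        block = X[:, cols]
        observed = ~np.isnan(block).any(axis=1)          # fully observed rows
        pca = PCA(n_components=min(n_components, len(cols)))
        pca.fit(block[observed])                         # PCA on observed part
        s = np.full((X.shape[0], pca.n_components_), np.nan)
        s[observed] = pca.transform(block[observed])
        scores.append(s)
    merged = np.hstack(scores)                 # reduced data, still with NaNs
    return imputer.fit_transform(merged)       # impute on the merged scores
```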


Theoretical Guarantees for Sparse Principal Component Analysis based on the Elastic Net

arXiv.org Machine Learning

Sparse principal component analysis (SPCA) is widely used for dimensionality reduction and feature extraction in high-dimensional data analysis. Despite many methodological and theoretical developments over the past two decades, the theoretical guarantees of the popular SPCA algorithm proposed by Zou, Hastie & Tibshirani (2006) are still unknown. This paper aims to address this critical gap. We first revisit the SPCA algorithm of Zou et al. (2006) and present our implementation. We also study a computationally more efficient variant of the SPCA algorithm in Zou et al. (2006) that can be considered a limiting case of SPCA. We provide guarantees of convergence to a stationary point for both algorithms and prove that, under a sparse spiked covariance model, both algorithms can recover the principal subspace consistently under mild regularity conditions. We show that their estimation error bounds match the best available bounds of existing works or the minimax rates up to logarithmic factors. Moreover, we demonstrate the competitive performance of both algorithms in numerical studies.
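
For orientation, the alternating structure of the Zou et al. (2006) algorithm can be sketched as an elastic-net regression step interleaved with a Procrustes rotation. The sketch below is illustrative only, assuming column-centered data; the regularization constants and iteration count are arbitrary, and it omits the per-component penalty choices and stopping rules the paper analyzes.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def spca_alternating(X, k=2, alpha=1e-3, l1_ratio=0.9, n_iter=50):
    """Alternate elastic-net regressions (B-step) with a Procrustes
    rotation (A-step), in the spirit of Zou et al. (2006) SPCA."""
    n, p = X.shape
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    A = Vt[:k].T                       # initialize with ordinary PCA loadings
    B = np.zeros((p, k))
    for _ in range(n_iter):
        for j in range(k):             # B-step: one elastic net per component
            enet = ElasticNet(alpha=alpha, l1_ratio=l1_ratio,
                              fit_intercept=False, max_iter=5000)
            B[:, j] = enet.fit(X, X @ A[:, j]).coef_
        U, _, Vt2 = np.linalg.svd(X.T @ (X @ B), full_matrices=False)
        A = U @ Vt2                    # A-step: A = UV' (Procrustes)
    norms = np.linalg.norm(B, axis=0)
    norms[norms == 0] = 1.0
    return B / norms                   # sparse loadings, unit-normalized
```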


Node Splitting: A Constructive Algorithm for Feed-Forward Neural Networks

Neural Information Processing Systems

A constructive algorithm is proposed for feed-forward neural networks, which uses node-splitting in the hidden layers to build large networks from smaller ones. The small network forms an approximate model of a set of training data, and the split creates a larger, more powerful network which is initialised with the approximate solution already found. The insufficiency of the smaller network in modelling the system which generated the data leads to oscillation in those hidden nodes whose weight vectors cover regions in the input space where more detail is required in the model. These nodes are identified and split in two using principal component analysis, allowing the new nodes to cover the two main modes of each oscillating vector. Nodes are selected for splitting using principal component analysis on the oscillating weight vectors, or by examining the Hessian matrix of second derivatives of the network error with respect to the weights.
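
The split itself is easy to picture: take the first principal direction of a node's oscillating weight updates and offset the two children along it. The following is a minimal sketch of that one step, assuming a recorded history of weight-update vectors; the function name, the eps offset, and the selection criterion are illustrative, not the paper's.

```python
import numpy as np

def split_node(w, update_history, eps=0.5):
    """Split one hidden node along the first principal direction of its
    recent weight updates.

    w              : (d,) current weight vector of the oscillating node
    update_history : (t, d) weight-update vectors recorded over training
    """
    centered = update_history - update_history.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    pc1 = Vt[0]                           # dominant mode of the oscillation
    return w + eps * pc1, w - eps * pc1   # the two child nodes
```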


Self-Organizing Rules for Robust Principal Component Analysis

Neural Information Processing Systems

In the presence of outliers, the existing self-organizing rules for Principal Component Analysis (PCA) perform poorly. Using statistical physics techniques including the Gibbs distribution, binary decision fields and effective energies, we propose self-organizing PCA rules which are capable of resisting outliers while fulfilling various PCA-related tasks such as obtaining the first principal component vector, obtaining the first k principal component vectors, and directly finding the subspace spanned by the first k principal component vectors without solving for each vector individually. Comparative experiments have shown that the proposed robust rules improve the performance of the existing PCA algorithms significantly when outliers are present.
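
To make the flavour of such rules concrete, here is a toy robust variant of Oja's rule in which a soft gate downweights samples the current component reconstructs poorly. This is only a stand-in sketch: the sigmoid gate loosely mimics the role of the paper's binary decision fields, and the learning rate, gate sharpness beta, and epoch count are arbitrary.

```python
import numpy as np

def robust_oja(X, lr=0.01, beta=5.0, n_epochs=20, seed=0):
    """Oja's rule with a soft outlier gate: samples with large residual
    energy under the current component get small weight in the update."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_epochs):
        y = X @ w
        resid = np.linalg.norm(X - np.outer(y, w), axis=1) ** 2
        z = np.clip(beta * (resid - resid.mean()), -50.0, 50.0)
        gate = 1.0 / (1.0 + np.exp(z))         # near 0 for outlying samples
        for i in rng.permutation(len(X)):
            yi = X[i] @ w
            w += lr * gate[i] * yi * (X[i] - yi * w)   # gated Oja update
        w /= np.linalg.norm(w)
    return w
```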


Unsupervised Parallel Feature Extraction from First Principles

Neural Information Processing Systems

We describe a number of learning rules that can be used to train unsupervised parallel feature extraction systems. The learning rules are derived using gradient ascent of a quality function. We consider a number of quality functions that are rational functions of higher-order moments of the extracted feature values. We show that one system learns the principal components of the correlation matrix. Principal component analysis systems are usually not optimal feature extractors for classification.
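
The simplest case mentioned here, gradient ascent on a second-moment quality function that recovers the first principal component of the correlation matrix, can be sketched in a few lines. This is an illustration rather than one of the paper's rules: Q(w) = w'Cw with renormalization is the textbook example, and the step size and iteration count are arbitrary.

```python
import numpy as np

def quality_ascent(X, lr=0.05, n_steps=200, seed=0):
    """Gradient ascent on Q(w) = w'Cw with renormalization, where C is the
    sample second-moment (correlation) matrix; converges to the first
    principal component."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    C = X.T @ X / len(X)               # second-moment (correlation) matrix
    for _ in range(n_steps):
        w += lr * 2.0 * C @ w          # gradient of Q(w) = w'Cw
        w /= np.linalg.norm(w)         # project back to the unit sphere
    return w
```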


Non-Linear Statistical Analysis and Self-Organizing Hebbian Networks

Neural Information Processing Systems

Neurons learning under an unsupervised Hebbian learning rule can perform a nonlinear generalization of principal component analysis. This relationship between nonlinear PCA and nonlinear neurons is reviewed. The stable fixed points of the neuron learning dynamics correspond to the maxima of the statistic optimized under nonlinear PCA; this is shown for a simple model. Methods of statistical mechanics can be used to find the optima of the objective function of nonlinear PCA.
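
A common concrete form of this nonlinear generalization is an Oja-style rule with a nonlinearity applied to the neuron's output. The sketch below assumes that standard form, dw proportional to g(y)(x - g(y)w) with g = tanh; it is a generic illustration rather than the specific model analyzed in the paper.

```python
import numpy as np

def nonlinear_hebbian(X, g=np.tanh, lr=0.01, n_epochs=50, seed=0):
    """One neuron under a nonlinear Hebbian (Oja-style) rule:
    dw = lr * g(y) * (x - g(y) * w), with y = w . x."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_epochs):
        for i in rng.permutation(len(X)):
            y = X[i] @ w
            w += lr * g(y) * (X[i] - g(y) * w)
    return w
```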


Analysis of Unstandardized Contributions in Cross Connected Networks

Neural Information Processing Systems

Understanding knowledge representations in neural nets has been a difficult problem. Principal components analysis (PCA) of contributions (products of sending activations and connection weights) has yielded valuable insights into knowledge representations, but much of this work has focused on the correlation matrix of contributions. The present work shows that analyzing the variance-covariance matrix of contributions yields more valid insights by taking account of weights.
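
The distinction the abstract draws is between eigendecomposing the correlation matrix of contributions, which standardizes away the scale that the weights impose, and the variance-covariance matrix, which keeps it. A minimal sketch of both options, with hypothetical argument names, follows.

```python
import numpy as np

def contribution_pca(activations, weights, use_covariance=True):
    """PCA of contributions (sending activations times connection weights).

    activations : (n_patterns, n_senders) sending-unit activations
    weights     : (n_senders,) weights into one receiving unit
    """
    contributions = activations * weights        # broadcasts over patterns
    centered = contributions - contributions.mean(axis=0)
    if use_covariance:
        # variance-covariance matrix: keeps the scale the weights impose
        M = centered.T @ centered / (len(centered) - 1)
    else:
        # correlation matrix: standardizes each contribution away
        M = np.corrcoef(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]
```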


Recognizing Handwritten Digits Using Mixtures of Linear Models

Neural Information Processing Systems

We construct a mixture of locally linear generative models of a collection of pixel-based images of digits, and use them for recognition. Different models of a given digit are used to capture different styles of writing, and new images are classified by evaluating their log-likelihoods under each model. We use an EM-based algorithm in which the M-step is computationally straightforward principal components analysis (PCA). Incorporating tangent-plane information [12] about expected local deformations only requires adding tangent vectors into the sample covariance matrices for the PCA, and it demonstrably improves performance.
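
The classification step, scoring a new image's log-likelihood under each linear-Gaussian model, can be sketched with off-the-shelf probabilistic PCA. The simplification below fits one model per digit class rather than an EM-trained mixture of per-style models, and it omits the tangent-vector augmentation; `train_by_class` is a hypothetical dict mapping class labels to image matrices.

```python
import numpy as np
from sklearn.decomposition import PCA

def classify_by_likelihood(train_by_class, X_test, n_components=10):
    """Fit one probabilistic PCA model per class and label each test image
    by the model under which it has the highest log-likelihood."""
    models = {c: PCA(n_components=n_components).fit(Xc)
              for c, Xc in train_by_class.items()}
    labels = sorted(models)
    # PCA.score_samples returns per-sample log-likelihoods under the
    # probabilistic PCA model implied by the fit
    ll = np.column_stack([models[c].score_samples(X_test) for c in labels])
    return np.asarray(labels)[ll.argmax(axis=1)]
```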


Dynamic Features for Visual Speechreading: A Systematic Comparison

Neural Information Processing Systems

Humans use visual as well as auditory speech signals to recognize spoken words. A variety of systems have been investigated for performing this task. The main purpose of this research was to systematically compare the performance of a range of dynamic visual features on a speechreading task. We have found that normalization of images to eliminate variation due to translation, scale, and planar rotation yielded substantial improvements in generalization performance regardless of the visual representation used. In addition, the dynamic information in the difference between successive frames yielded better performance than optical-flow-based approaches, and compression by local low-pass filtering surprisingly outperformed global principal components analysis (PCA).
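
The favoured representation, frame differences compressed by local low-pass filtering and subsampling, is straightforward to sketch. The filter width and stride below are arbitrary, and the function assumes already-normalized grayscale frames; it illustrates the feature family, not the paper's pipeline.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def delta_features(frames, sigma=2.0, stride=4):
    """Dynamic features: differences of successive frames, compressed by
    local low-pass filtering and subsampling (instead of global PCA).

    frames : (t, h, w) array of normalized grayscale mouth images
    """
    deltas = np.diff(frames.astype(float), axis=0)             # frame diffs
    smooth = gaussian_filter(deltas, sigma=(0, sigma, sigma))  # spatial only
    return smooth[:, ::stride, ::stride].reshape(len(deltas), -1)
```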


Viewpoint Invariant Face Recognition using Independent Component Analysis and Attractor Networks

Neural Information Processing Systems

We have explored two approaches to recognizing faces across changes in pose. First, we developed a representation of face images based on independent component analysis (ICA) and compared it to a principal component analysis (PCA) representation for face recognition. The ICA basis vectors for this data set were more spatially local than the PCA basis vectors, and the ICA representation had greater invariance to changes in pose. Second, we present a model for the development of viewpoint-invariant responses to faces from visual experience in a biological system. The temporal continuity of natural visual experience was incorporated into an attractor network model by Hebbian learning following a low-pass temporal filter on unit activities.
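
The first comparison, fitting PCA and ICA bases to the same image matrix and inspecting their spatial locality, can be reproduced in outline with standard tools. FastICA below is a generic stand-in for the paper's ICA architecture, and the component count is arbitrary.

```python
from sklearn.decomposition import PCA, FastICA

def pca_and_ica_bases(faces, n_components=20):
    """Fit PCA and ICA bases to the same flattened, aligned face images so
    their spatial locality can be compared side by side.

    faces : (n_images, n_pixels) matrix
    """
    pca = PCA(n_components=n_components).fit(faces)
    ica = FastICA(n_components=n_components, random_state=0).fit(faces)
    # each row of components_ is one basis image, reshapeable for display
    return pca.components_, ica.components_
```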