Using statistical physics techniques including the Gibbs distribution, binary decision fields and effective energies, we propose self-organizing PCA rules which are capable of resisting outliers while fulfilling various PCA-related tasks such as obtaining the first principal component vector, obtaining the first k principal component vectors, and directly finding the subspace spanned by the first k principal component vectors without solving for each vector individually. Comparative experiments have shown that the proposed robust rules significantly improve the performance of existing PCA algorithms when outliers are present.
When certain data points are heavily contaminated, existing principal component analysis (PCA) methods are frequently incapable of filtering out and eliminating them, which can degrade the corresponding models. To tackle this issue, we propose a general framework, robust weight learning with adaptive neighbors (RWL-AN), in which an adaptive weight vector is obtained automatically with both robustness and sparse neighbors. More significantly, the degree of sparsity is steerable, so that exactly k well-fitting samples with the least reconstruction errors are activated during the optimization, while the remaining samples, i.e., the extremely noisy ones, are eliminated for global robustness. Additionally, the framework is applied to the PCA problem to demonstrate the superiority and effectiveness of the proposed RWL-AN model. Papers published at the Neural Information Processing Systems Conference.
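The idea of activating only the k samples with the least reconstruction errors can be illustrated with a much simpler stand-in. The sketch below is not the RWL-AN algorithm; it is a plain alternating "trimmed PCA" loop with binary weights, and the function name `trimmed_pca`, its parameters, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def trimmed_pca(X, n_components, k, n_iter=20):
    """Alternate between fitting PCA on the active samples and
    re-selecting the k samples with the smallest reconstruction error.
    Illustrative only; binary weights stand in for a learned weight vector."""
    def fit(Xa):
        mean = Xa.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xa - mean, full_matrices=False)
        return mean, Vt[:n_components].T          # basis has shape (d, n_components)

    n = X.shape[0]
    active = np.arange(n)                         # start with every sample active
    for _ in range(n_iter):
        mean, W = fit(X[active])
        Xc = X - mean
        residual = Xc - Xc @ W @ W.T              # part not explained by the subspace
        errors = np.sum(residual**2, axis=1)
        new_active = np.sort(np.argsort(errors)[:k])
        if np.array_equal(new_active, active):    # selection stopped changing
            break
        active = new_active
    mean, W = fit(X[active])                      # final fit on the selected samples
    weights = np.zeros(n)
    weights[active] = 1.0                         # exactly k nonzero weights
    return W, mean, weights

# Synthetic example (hypothetical data): 50 points near a line, 5 gross outliers.
rng = np.random.default_rng(0)
direction = np.array([1.0, 1.0]) / np.sqrt(2.0)
inliers = rng.normal(scale=10.0, size=(50, 1)) * direction
inliers = inliers + rng.normal(scale=0.05, size=(50, 2))
outliers = np.array([10.0, -10.0]) + rng.normal(scale=0.1, size=(5, 2))
X = np.vstack([inliers, outliers])

W, mean, weights = trimmed_pca(X, n_components=1, k=50)
```

Here `weights` plays the role of a sparse weight vector with exactly k active entries; a method such as the one described above would learn the weights within a single optimization problem rather than by this hard re-selection.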
This paper extends robust principal component analysis (RPCA) to nonlinear manifolds. Suppose that the observed data matrix is the sum of a sparse component and a component drawn from some low-dimensional manifold. Is it possible to separate them using ideas similar to those of RPCA? Is there any benefit in treating the manifold as a whole as opposed to treating each local region independently? We answer these two questions affirmatively by proposing and analyzing an optimization framework that separates the sparse component from the manifold under noisy data.
The Slow Feature Analysis (SFA) unsupervised learning framework extracts features representing the underlying causes of the changes within a temporally coherent high-dimensional raw sensory input signal. We develop the first online version of SFA, via a combination of incremental Principal Components Analysis and Minor Components Analysis. Unlike standard batch-based SFA, online SFA adapts along with non-stationary environments, which makes it a generally useful unsupervised preprocessor for autonomous learning agents. We compare online SFA to batch SFA in several experiments and show that it indeed learns without a teacher to encode the input stream by informative slow features representing meaningful abstract environmental properties. We extend online SFA to deep networks in hierarchical fashion, and use them to successfully extract abstract object position information from high-dimensional video.
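For orientation, the batch linear case can be sketched in a few lines: whiten the input with PCA, then take the minor components (smallest-eigenvalue directions) of the covariance of the temporal differences. This is a minimal sketch of standard batch linear SFA, not the online method described above; the function name `linear_sfa` and the toy signals are assumptions.

```python
import numpy as np

def linear_sfa(X, n_features=1):
    """Batch linear SFA sketch: PCA whitening followed by minor
    component extraction on the time-differenced signal."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(Xc) - 1)
    evals, evecs = np.linalg.eigh(cov)
    whiten = evecs / np.sqrt(evals)        # scale each eigenvector column
    Z = Xc @ whiten                        # whitened signal, unit covariance
    dZ = np.diff(Z, axis=0)                # finite-difference time derivative
    dcov = dZ.T @ dZ / (len(dZ) - 1)
    _, devecs = np.linalg.eigh(dcov)       # eigenvalues in ascending order
    W = whiten @ devecs[:, :n_features]    # slowest directions come first
    return Xc @ W, W

# Toy input (hypothetical): a linear mixture of a slow and a fast sinusoid.
t = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
slow = np.sin(t)
fast = np.sin(25.0 * t)
X = np.column_stack([slow + 0.5 * fast, 0.3 * slow - fast])

y, W = linear_sfa(X, n_features=1)
corr = np.corrcoef(y[:, 0], slow)[0, 1]    # sign of the feature is arbitrary
```

The two eigen-decompositions are the batch counterparts of the principal-component and minor-component steps that an online version has to estimate incrementally.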
What if our data have way more than 3 dimensions? In the table is the average consumption of 17 types of food in grams per person per week for every country in the UK. The table shows some interesting variations across different food types, but overall differences aren't so notable. Let's see if PCA can eliminate dimensions to emphasize how countries differ. Already we can see something is different about Northern Ireland.
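Here's roughly what that reduction looks like in code. This is a minimal sketch using NumPy's SVD; the numbers are random stand-ins with the same shape as the table (4 rows, 17 columns), not the real food data, and `pca_project` is just a name picked for this example.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)                        # center each column (food type)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # coordinates of each row
    explained = S**2 / np.sum(S**2)                # share of variance per component
    return scores, Vt[:n_components], explained[:n_components]

rng = np.random.default_rng(0)
# Hypothetical stand-in data: 4 rows (countries) x 17 columns (food types).
X = rng.normal(loc=500.0, scale=50.0, size=(4, 17))
X[3] += rng.normal(scale=200.0, size=17)           # make one row differ more

scores, components, explained = pca_project(X, n_components=2)
```

Each row of `scores` is one country's position along the first two principal components, so a country that eats very differently from the others should sit far from the rest when you plot them.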