Goto

Collaborating Authors

 Dimensionality Reduction


Discriminative Unsupervised Dimensionality Reduction

AAAI Conferences

As an important machine learning topic, dimensionality reduction has been widely studied and utilized in various kinds of areas. A multitude of dimensionality reduction methods have been developed, among which unsupervised dimensionality reduction is more desirable when obtaining label information requires onerous work. However, most previous unsupervised dimensionality reduction methods call for an affinity graph constructed beforehand, with which the following dimensionality reduction steps can be then performed. Separation of graph construction and dimensionality reduction leads the dimensionality reduction process highly dependent on quality of the input graph. In this paper, we propose a novel graph embedding method for unsupervised dimensionality reduction. We simultaneously conduct dimensionality reduction along with graph construction by assigning adaptive and optimal neighbors according to the projected local distances. Our method doesn’t need an affinity graph constructed in advance, but instead learns the graph concurrently with dimensionality reduction. Thus, the learned graph is optimal for dimensionality reduction. Meanwhile, our learned graph has an explicit block diagonal structure, from which the clustering results could be directly revealed without any postprocessing steps. Extensive empirical results on dimensionality reduction as well as clustering are presented to corroborate the performance of our method.


Compressed Spectral Regression for Efficient Nonlinear Dimensionality Reduction

AAAI Conferences

Spectral dimensionality reduction methods have recently emerged as powerful tools for various applications in pattern recognition, data mining and computer vision. These methods use information contained in the eigenvectors of a data affinity (i.e, item-item similarity) matrix to reveal the low dimensional structure of the high dimensional data. One of the limitations of various spectral dimensionality reduction methods is their high computational complexity. They all need to construct a data affinity matrix and compute the top eigenvectors. This leads to O(n2) computational complexity, where n is the number of samples. Moreover, when the data are highly non-linear distributed, some linear methods have to be performed in a reproducing kernel Hilbert space (leads to the corresponding kernel methods) to learn an effective non-linear mapping. The computational complexity of these kernel methods is O(n3). In this paper, we propose a novel nonlinear dimensionality reduction algorithm, called Compressed Spectral Regression, with O(n) computational complexity. Extensive experiments on data clustering demonstrate the effectiveness and efficiency of the proposed approach.


Adaptive Metric Dimensionality Reduction

arXiv.org Machine Learning

Linear classifiers play a central role in supervised learning, with a rich and elegant theory. This setting assumes data is represented as points in a Hilbert space, either explicitly as feature vectors or implicitly via a kernel. A significant strength of the Hilbert-space model is its inner-product structure, which has been exploited statistically and algorithmically by sophisticated techniques from geometric and functional analysis, placing the celebrated hyperplane methods on a solid foundation. However, the success of the Hilbert-space model obscures its limitations -- perhaps the most significant of which is that it cannot represent many norms and distance functions that arise naturally in applications.


IT-map: an Effective Nonlinear Dimensionality Reduction Method for Interactive Clustering

arXiv.org Machine Learning

In our previous works (1, 2), we have shown its potential in cluster analysis. Combinations of the IT structure with the Semi-Supervised learning concept (3), Rodriguez and Laio's "Decision Graph" (4), and Frey and Dueck's "Affinity Propagation" (AP) (5), have resulted in effective cluster analysis methods. For example, based on the IT structure, the application scope of AP was extended from spherical to nonspherical cluster detection (2). In this paper, we will show another potential of the IT structure: nonlinear dimensionality reduction, for which an effective combination is made with the "isometric mapping" (Isomap) proposed by Tenenbaum et al (6). Isomap is a simple and effective dimensionality reduction method which extends the application scope of multidimensional scaling (MDS) from linear to nonlinear structure. It contains three steps: first construct the K-nearest-neighborhood (KNN) graph, then compute the graph distances (the shortest path distances in the neighborhood graph) and lastly compute the low-dimensional embedding by classical MDS. In effect, the constructed KNN graph for data points is unfolded in the low-dimensional Euclidean space, which is effective especially for preserving in the embedding the topology relationship of data points on manifolds. The crux of the success for Isomap is that it takes as the input for classical MDS the graph distances, instead of the straight-line Euclidian ones, for all pairs of data points.


Dimensionality Reduction via Program Induction

AAAI Conferences

How can techniques drawn from machine learning be appliedto the learning of structured, compositional representations? In this work, we adopt functional programs as our representation, and cast the problem of learning symbolic representations as a symbolic analog of dimensionality reduction. By placing program synthesis within a probabilistic machinelearning framework, we are able to model the learning ofsome English inflectional morphology and solve a set of synthetic regression problems.


On Convolutional Approximations to Linear Dimensionality Reduction Operators for Large Scale Data Processing

arXiv.org Machine Learning

In this paper, we examine the problem of approximating a general linear dimensionality reduction (LDR) operator, represented as a matrix $A \in \mathbb{R}^{m \times n}$ with $m < n$, by a partial circulant matrix with rows related by circular shifts. Partial circulant matrices admit fast implementations via Fourier transform methods and subsampling operations; our investigation here is motivated by a desire to leverage these potential computational improvements in large-scale data processing tasks. We establish a fundamental result, that most large LDR matrices (whose row spaces are uniformly distributed) in fact cannot be well approximated by partial circulant matrices. Then, we propose a natural generalization of the partial circulant approximation framework that entails approximating the range space of a given LDR operator $A$ over a restricted domain of inputs, using a matrix formed as a product of a partial circulant matrix having $m '> m$ rows and a $m \times k$ 'post processing' matrix. We introduce a novel algorithmic technique, based on sparse matrix factorization, for identifying the factors comprising such approximations, and provide preliminary evidence to demonstrate the potential of this approach.


Dimensionality Reduction with Subspace Structure Preservation

Neural Information Processing Systems

Modeling data as being sampled from a union of independent subspaces has been widely applied to a number of real world applications. However, dimensionality reduction approaches that theoretically preserve this independence assumption have not been well studied. Our key contribution is to show that $2K$ projection vectors are sufficient for the independence preservation of any $K$ class data sampled from a union of independent subspaces. It is this non-trivial observation that we use for designing our dimensionality reduction technique. In this paper, we propose a novel dimensionality reduction algorithm that theoretically preserves this structure for a given dataset. We support our theoretical analysis with empirical results on both synthetic and real world data achieving \textit{state-of-the-art} results compared to popular dimensionality reduction techniques.


Cortical spatio-temporal dimensionality reduction for visual grouping

arXiv.org Machine Learning

The visual systems of many mammals, including humans, is able to integrate the geometric information of visual stimuli and to perform cognitive tasks already at the first stages of the cortical processing. This is thought to be the result of a combination of mechanisms, which include feature extraction at single cell level and geometric processing by means of cells connectivity. We present a geometric model of such connectivities in the space of detected features associated to spatio-temporal visual stimuli, and show how they can be used to obtain low-level object segmentation. The main idea is that of defining a spectral clustering procedure with anisotropic affinities over datasets consisting of embeddings of the visual stimuli into higher dimensional spaces. Neural plausibility of the proposed arguments will be discussed.


The Role of Dimensionality Reduction in Classification

AAAI Conferences

Dimensionality reduction (DR) is often used as a preprocessing step in classification, but usually one first fixes the DR mapping, possibly using label information, and then learns a classifier (a filter approach). Best performance would be obtained by optimizing the classification error jointly over DR mapping and classifier (a wrapper approach), but this is a difficult nonconvex problem, particularly with nonlinear DR. Using the method of auxiliary coordinates, we give a simple, efficient algorithm to train a combination of nonlinear DR and a classifier, and apply it to a RBF mapping with a linear SVM. This alternates steps where we train the RBF mapping and a linear SVM as usual regression and classification, respectively, with a closed-form step that coordinates both. The resulting nonlinear low-dimensional classifier achieves classification errors competitive with the state-of-the-art but is fast at training and testing, and allows the user to trade off runtime for classification accuracy easily. We then study the role of nonlinear DR in linear classification, and the interplay between the DR mapping, the number of latent dimensions and the number of classes. When trained jointly, the DR mapping takes an extreme role in eliminating variation: it tends to collapse classes in latent space, erasing all manifold structure, and lay out class centroids so they are linearly separable with maximum margin.


Convex Optimization Learning of Faithful Euclidean Distance Representations in Nonlinear Dimensionality Reduction

arXiv.org Machine Learning

Classical multidimensional scaling only works well when the noisy distances observed in a high dimensional space can be faithfully represented by Euclidean distances in a low dimensional space. Advanced models such as Maximum Variance Unfolding (MVU) and Minimum Volume Embedding (MVE) use Semi-Definite Programming (SDP) to reconstruct such faithful representations. While those SDP models are capable of producing high quality configuration numerically, they suffer two major drawbacks. One is that there exist no theoretically guaranteed bounds on the quality of the configuration. The other is that they are slow in computation when the data points are beyond moderate size. In this paper, we propose a convex optimization model of Euclidean distance matrices. We establish a non-asymptotic error bound for the random graph model with sub-Gaussian noise, and prove that our model produces a matrix estimator of high accuracy when the order of the uniform sample size is roughly the degree of freedom of a low-rank matrix up to a logarithmic factor. Our results partially explain why MVU and MVE often work well. Moreover, we develop a fast inexact accelerated proximal gradient method. Numerical experiments show that the model can produce configurations of high quality on large data points that the SDP approach would struggle to cope with.