Topological Stability: a New Algorithm for Selecting The Nearest Neighbors in Non-Linear Dimensionality Reduction Techniques

arXiv.org Machine Learning

In the machine learning field, dimensionality reduction is an important task. It mitigates the undesired properties of high-dimensional spaces to facilitate classification, compression, and visualization of high-dimensional data. During the last decade, researchers have proposed many new (non-linear) techniques for dimensionality reduction. Most of these techniques are based on the intuition that the data lies on or near a complex low-dimensional manifold embedded in the high-dimensional space, and they aim at identifying and extracting that manifold. Isomap is one of the most widely used low-dimensional embedding methods; it combines geodesic distances on a weighted graph with classical scaling (metric multidimensional scaling). Isomap chooses the nearest neighbours based on distance alone, which can create bridges (short-circuit edges) and topological instability. In this paper, we propose a new algorithm for choosing the nearest neighbours that reduces the number of short-circuit errors and hence improves topological stability. The idea is that, at any point on the manifold, the point and its nearest neighbours span a vector subspace, and a vector orthogonal to that subspace is orthogonal to every vector spanning it. The proposed algorithm uses the point itself and its two nearest neighbours to find a basis of the subspace and a vector orthogonal to that subspace, which belongs to the orthogonal complement. It then admits further points, beyond the two nearest neighbours, based on both the distance and the angle between each candidate point and the orthogonal vector. The superior performance of the new algorithm in choosing the nearest neighbours is confirmed through experiments on several datasets.
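A minimal sketch of the selection rule as described, assuming a three-dimensional ambient space (as in the classic Swiss-roll setting) so that the orthogonal complement of the local subspace is a single direction; the function name select_neighbours and the angle margin are illustrative, not from the paper:

```python
import numpy as np
from scipy.spatial.distance import cdist

def select_neighbours(X, i, k, angle_margin_deg=20.0):
    """Sketch: admit neighbours that are close AND lie near the local tangent plane."""
    dists = cdist(X[i:i + 1], X).ravel()
    order = np.argsort(dists)[1:]        # candidates, nearest first (self excluded)
    n1, n2 = order[0], order[1]          # the two nearest neighbours

    # Basis directions of the local subspace and (in 3-D) the unique
    # direction orthogonal to it; assumes v1 and v2 are not collinear.
    v1, v2 = X[n1] - X[i], X[n2] - X[i]
    normal = np.cross(v1, v2)
    normal /= np.linalg.norm(normal)

    selected = [n1, n2]
    for j in order[2:]:
        if len(selected) == k:
            break
        u = X[j] - X[i]
        u /= np.linalg.norm(u)
        # Angle between the candidate direction and the subspace normal:
        # close to 90 degrees means the candidate lies near the local subspace.
        angle = np.degrees(np.arccos(np.clip(abs(normal @ u), 0.0, 1.0)))
        if angle >= 90.0 - angle_margin_deg:
            selected.append(j)
    return selected
```

The angle test is what lets the rule reject a candidate that is close in Euclidean distance but leaves the local tangent plane, exactly the kind of point that would otherwise form a short-circuit edge.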


Step-by-Step Signal Processing with Machine Learning: Manifold Learning

#artificialintelligence

In my first article on signal processing using machine learning, I introduced Principal Component Analysis (PCA) and Independent Component Analysis (ICA) for dimensionality reduction. We saw how these methods can reduce the number of features in our data. However, they are linear methods: they do not always perform well when there are nonlinear relationships within the data. This is where manifold learning comes in. A manifold is any space that is locally Euclidean.
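A quick comparison, assuming scikit-learn is available, makes the distinction concrete on the classic S-curve dataset: the linear projection flattens the curve, while a manifold learner such as Isomap unrolls it.

```python
from sklearn.datasets import make_s_curve
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap

# A nonlinear dataset: points sampled from a 2-D sheet curled into 3-D.
X, color = make_s_curve(n_samples=1000, random_state=0)

X_pca = PCA(n_components=2).fit_transform(X)                      # linear projection
X_iso = Isomap(n_neighbors=10, n_components=2).fit_transform(X)   # unrolls the sheet
```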


Error Metrics for Learning Reliable Manifolds from Streaming Data

arXiv.org Machine Learning

Spectral dimensionality reduction is frequently used to identify low-dimensional structure in high-dimensional data. However, learning manifolds, especially from streaming data, is computationally expensive and memory intensive. In this paper, we argue that a stable manifold can be learned using only a fraction of the stream, and the remaining stream can be mapped to the manifold in a significantly less costly manner. Identifying the transition point at which the manifold becomes stable is the key step. We present error metrics that allow us to identify the transition point for a given stream by quantitatively assessing the quality of a manifold learned using Isomap. We further propose an efficient mapping algorithm, called S-Isomap, that can be used to map new samples onto the stable manifold. Experiments on a variety of data sets show that the proposed approach is computationally efficient without sacrificing accuracy.
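The batch-then-map idea can be sketched with scikit-learn's Isomap, whose transform method provides an out-of-sample mapping; this is not the paper's S-Isomap mapping or its error metrics, and the fixed split fraction below merely stands in for the detected transition point.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

X, _ = make_swiss_roll(n_samples=5000, random_state=0)
split = int(0.2 * len(X))   # stand-in for the learned transition point

# Learn the manifold on the early fraction of the "stream" only,
# then map the remainder without refitting.
iso = Isomap(n_neighbors=10, n_components=2).fit(X[:split])
Y_rest = iso.transform(X[split:])
```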


A Nonlinear Dimensionality Reduction Framework Using Smooth Geodesics

arXiv.org Machine Learning

Existing dimensionality reduction methods are adept at revealing hidden underlying manifolds in high-dimensional data and thereby producing a low-dimensional representation. However, classic techniques do not guarantee the smoothness of the produced manifolds in the presence of noise. In fact, an embedding generated from such non-smooth, noisy measurements may distort the geometry of the manifold and produce an unfaithful embedding. Herein, we propose a framework for nonlinear dimensionality reduction that generates the manifold in terms of smooth geodesics and is designed to treat problems in which the manifold measurements have been corrupted by noise. Our method builds a network structure for the given high-dimensional data using a neighborhood search and then produces piecewise linear shortest paths, which are taken as geodesics. We then fit the points along each geodesic with a smoothing spline to enforce smoothness. The robustness of this approach on noisy and sparse datasets is demonstrated by applying the method to synthetic and real-world datasets.
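A rough sketch of that pipeline under assumed names (the function smooth_geodesic, the neighborhood size k, and the spline smoothing factor s are illustrative, not from the paper): build a k-NN graph, extract a shortest path as a piecewise linear geodesic, and smooth the points along it with a spline.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph

def smooth_geodesic(X, src, dst, k=10, s=1.0):
    # k-NN graph with Euclidean edge weights; assumes it is connected.
    G = kneighbors_graph(X, n_neighbors=k, mode="distance")
    _, pred = shortest_path(G, method="D", directed=False,
                            return_predecessors=True, indices=src)

    # Recover the piecewise linear geodesic from src to dst.
    path = [dst]
    while path[-1] != src:
        path.append(pred[path[-1]])
    pts = X[np.array(path[::-1])]

    # Smoothing spline per coordinate, parameterised by arc length;
    # assumes the path has enough points for a cubic spline.
    t = np.concatenate([[0.0],
                        np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))])
    return np.stack([UnivariateSpline(t, pts[:, d], s=s)(t)
                     for d in range(X.shape[1])], axis=1)
```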


Low-Rank Isomap Algorithm

arXiv.org Machine Learning

Isomap is a well-known nonlinear dimensionality reduction method that suffers from high computational complexity. Its computational complexity arises mainly from two stages: a) embedding a full graph on the data in the ambient space, and b) a complete eigenvalue decomposition. Although reducing the computational complexity of the graph stage has been investigated, the eigenvalue decomposition stage remains a bottleneck. In this paper, we propose the Low-Rank Isomap algorithm, which introduces a projection operator on the embedded graph from the ambient space to a low-rank latent space so that a partial eigenvalue decomposition can be applied. This approach reduces the complexity of Isomap to linear order while preserving structural information during the dimensionality reduction process. The superiority of the Low-Rank Isomap algorithm over several state-of-the-art algorithms is experimentally verified on facial image clustering in terms of speed and accuracy.
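The partial-eigendecomposition idea, taken in isolation, amounts to replacing the complete eigenvalue decomposition in Isomap's classical scaling step with a top-d eigensolver. The sketch below shows just that step; it is not the paper's projection operator or full algorithm.

```python
import numpy as np
from scipy.sparse.linalg import eigsh

def classical_scaling_partial(D_geo, d=2):
    """Embed from a geodesic distance matrix using only d eigenpairs."""
    n = D_geo.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    B = -0.5 * J @ (D_geo ** 2) @ J         # double-centred Gram matrix
    # Partial decomposition: only the d largest eigenpairs, instead of
    # the full O(n^3) eigendecomposition of classical Isomap.
    vals, vecs = eigsh(B, k=d, which="LA")
    return vecs * np.sqrt(np.maximum(vals, 0.0))
```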