Dimensionality Reduction
Supervised Discriminative Sparse PCA with Adaptive Neighbors for Dimensionality Reduction
Shi, Zhenhua, Wu, Dongrui, Huang, Jian, Wang, Yu-Kai, Lin, Chin-Teng
Dimensionality reduction is an important operation in information visualization, feature extraction, clustering, regression, and classification, especially for processing noisy high-dimensional data. However, most existing approaches preserve either the global or the local structure of the data, but not both. Approaches that preserve only the global data structure, such as principal component analysis (PCA), are usually sensitive to outliers. Approaches that preserve only the local data structure, such as locality preserving projections, are usually unsupervised (and hence cannot use label information) and use a fixed similarity graph. We propose a novel linear dimensionality reduction approach, supervised discriminative sparse PCA with adaptive neighbors (SDSPCAAN), to integrate neighborhood-free supervised discriminative sparse PCA and projected clustering with adaptive neighbors. As a result, both global and local data structures, as well as the label information, are used for better dimensionality reduction. Classification experiments on nine high-dimensional datasets validated the effectiveness and robustness of our proposed SDSPCAAN.
Using Dimensionality Reduction to Optimize t-SNE
t-SNE is a popular tool for embedding multi-dimensional datasets into two or three dimensions. However, it has a large computational cost, especially when the input data has many dimensions. Many use t-SNE to embed the output of a neural network, which is generally of much lower dimension than the original data. This limits the use of t-SNE in unsupervised scenarios. We propose using random projections to embed high-dimensional datasets into relatively few dimensions, and then using t-SNE to obtain a two-dimensional embedding. We show that random projections preserve the desirable clustering achieved by t-SNE, while dramatically reducing the runtime of finding the embedding.
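The random-projection step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the Gaussian projection matrix, the synthetic data, the target dimension of 200, and the helper names `random_project` and `pairwise_distances` are all assumptions made for the example.

```python
import numpy as np

def random_project(X, n_components, seed=0):
    """Project the rows of X onto n_components random Gaussian directions."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Dividing by sqrt(n_components) keeps squared distances unbiased in expectation.
    R = rng.standard_normal((n_features, n_components)) / np.sqrt(n_components)
    return X @ R

def pairwise_distances(X):
    """Euclidean distances between all rows of X."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.maximum(d2, 0.0))

# Synthetic stand-in for a real high-dimensional dataset (assumption).
rng = np.random.default_rng(42)
X_high = rng.standard_normal((100, 1000))
X_low = random_project(X_high, n_components=200)

# Compare each pairwise distance before and after the projection.
iu = np.triu_indices(X_high.shape[0], k=1)
ratio = pairwise_distances(X_low)[iu] / pairwise_distances(X_high)[iu]
print(X_low.shape, float(ratio.mean()))
```

The reduced matrix `X_low` would then be handed to any t-SNE implementation (for example scikit-learn's `TSNE`) in place of `X_high`. The distance ratio is only a rough sanity check that the projection has not destroyed the neighbourhood structure t-SNE relies on.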
Topological Stability: a New Algorithm for Selecting The Nearest Neighbors in Non-Linear Dimensionality Reduction Techniques
Elhenawy, Mohammed, Masoud, Mahmoud, Glaser, Sebastian, Rakotonirainy, Andry
In machine learning, dimensionality reduction is an important task. It mitigates the undesired properties of high-dimensional spaces to facilitate classification, compression, and visualization of high-dimensional data. During the last decade, researchers have proposed many new (non-linear) techniques for dimensionality reduction. Most of these techniques are based on the intuition that data lies on or near a complex low-dimensional manifold that is embedded in the high-dimensional space. New techniques for dimensionality reduction aim at identifying and extracting the manifold from the high-dimensional space. Isomap is one of the widely used low-dimensional embedding methods, in which geodesic distances on a weighted graph are combined with classical scaling (metric multidimensional scaling). Isomap chooses the nearest neighbours based on distance only, which causes bridges and topological instability. In this paper, we propose a new algorithm for choosing the nearest neighbours that reduces the number of short-circuit errors and hence improves topological stability. The algorithm rests on the observation that, at any point on the manifold, the point and its nearest neighbours form a vector subspace, and the orthogonal to that subspace is orthogonal to all vectors that span it. The proposed algorithm uses the point itself and its two nearest neighbours to find the bases of the subspace and the orthogonal to that subspace, which belongs to the orthogonal complementary subspace. The proposed algorithm then adds new points to the two nearest neighbours based on the distance and the angle between each new point and the orthogonal to the subspace. The superior performance of the new algorithm in choosing the nearest neighbours is confirmed through experimental work with several datasets.
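The abstract does not give the exact selection rule, so the following is only a loose sketch of the idea for points in 3-D. It assumes that the local plane is spanned by the offsets to the two nearest neighbours, that the "orthogonal" is their cross product, and that a candidate is accepted when its offset rises at most `max_angle_deg` out of that plane. The function name `select_neighbours` and the 20-degree threshold are hypothetical.

```python
import numpy as np

def select_neighbours(X, i, k, max_angle_deg=20.0):
    """Pick up to k neighbours of X[i] that stay close to the local plane.

    Assumes 3-D points, no duplicate points, and that the two nearest
    offsets are not collinear (otherwise the normal is undefined).
    """
    X = np.asarray(X, dtype=float)
    offsets = X - X[i]
    dists = np.linalg.norm(offsets, axis=1)
    order = [int(j) for j in np.argsort(dists) if j != i]

    # The two nearest neighbours define the local plane and its normal.
    n1, n2 = order[0], order[1]
    normal = np.cross(offsets[n1], offsets[n2])
    normal = normal / np.linalg.norm(normal)

    chosen = [n1, n2]
    for j in order[2:]:                      # remaining candidates, nearest first
        if len(chosen) >= k:
            break
        # Sine of the angle between the offset and the plane
        # (equivalently, cosine of its angle to the normal).
        sin_from_plane = abs(offsets[j] @ normal) / dists[j]
        angle_from_plane = np.degrees(np.arcsin(min(sin_from_plane, 1.0)))
        if angle_from_plane <= max_angle_deg:
            chosen.append(j)
    return chosen

# Toy data: points 1, 2, 4, 5 lie on or near the plane z = 0;
# point 3 is close in distance but sits well above the plane.
X = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.1, 0.0],
    [0.2, 0.2, 1.2],
    [1.3, 1.3, 0.0],
    [-2.0, 0.0, 0.1],
])
neighbours = select_neighbours(X, i=0, k=4)
print(neighbours)
```

In this toy example a plain 4-nearest-neighbour rule would include point 3, the kind of off-manifold link that produces a short circuit. The angle test skips it and takes the next in-plane points instead.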
Graph Convolutional Networks Meet with High Dimensionality Reduction
Recently, Graph Convolutional Networks (GCNs) and their variants have attracted considerable research interest for learning graph-related tasks. While GCNs have been successfully applied to the node classification problem, some caveats inherited from classical deep learning remain open research topics in this context. One such inherited caveat is that GCNs only consider the nodes that are a few propagations away from the labeled nodes to classify them. However, taking into account only nodes a few propagation steps away defeats the purpose of using the graph topological information in GCNs. To remedy this problem, state-of-the-art methods leverage network diffusion approaches, namely personalized page rank and its variants, to fully account for the graph topology, after they use the neural networks in the GCNs. However, these approaches overlook the fact that network diffusion methods favour high-degree nodes in the graph, resulting in the propagation of labels to unlabeled, centralized hub nodes. To address this hub-node bias, in this paper we propose to utilize a dimensionality reduction technique in conjunction with personalized page rank, so that we can both take advantage of the graph topology and resolve the hub-node favouring problem for GCNs. Our approach opens a new holistic road for the message passing phase of GCNs by suggesting the usage of proximity matrices other than the well-known Laplacian. Testing on two real-world networks that are commonly used for benchmarking GCN performance on node classification, we systematically evaluate the performance of the proposed methodology and show that our approach outperforms existing methods for wide ranges of parameter values with very limited deep learning training epochs.
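The hub-favouring behaviour of personalized page rank mentioned above can be seen on a toy graph. The sketch below uses a standard power iteration; the graph, the restart probability `alpha = 0.15`, and the function name `personalized_pagerank` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def personalized_pagerank(A, seed, alpha=0.15, n_iter=200):
    """Power iteration for p = alpha * e_seed + (1 - alpha) * P @ p."""
    A = np.asarray(A, dtype=float)
    # Column-stochastic transition matrix; assumes no isolated nodes.
    P = A / A.sum(axis=0, keepdims=True)
    e = np.zeros(A.shape[0])
    e[seed] = 1.0
    p = e.copy()
    for _ in range(n_iter):
        p = alpha * e + (1.0 - alpha) * (P @ p)
    return p

# Undirected toy graph: node 0 is the seed, node 1 is a degree-1 neighbour
# of the seed, node 2 is a hub that is also linked to nodes 3, 4, 5, 6.
edges = [(0, 1), (0, 2), (2, 3), (2, 4), (2, 5), (2, 6)]
A = np.zeros((7, 7))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

scores = personalized_pagerank(A, seed=0)
print(np.round(scores, 3))
```

Nodes 1 and 2 are both direct neighbours of the seed, yet the hub (node 2) ends up with the larger score because probability mass keeps flowing back to it from its other neighbours.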
Dimensionality Reduction 101 for Dummies like Me
Let's start with WHY we need to perform dimensionality reduction. Before analyzing data and drawing inferences, it is often necessary to visualize the data set in order to get an idea of it. But nowadays data sets contain a lot of random variables (also called features), which makes the data set difficult to visualize. Sometimes it is even impossible to visualize such high-dimensional data, as we humans go astray once we reach more than 3 dimensions. This is where dimensionality reduction comes in: the process of reducing the number of random variables in the data set under consideration by obtaining a set of principal variables.
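To make "obtaining a set of principal variables" concrete, here is a small sketch that reduces a 10-feature data set to 2 dimensions with PCA computed through an SVD. PCA is just one common choice, and the synthetic data and the function name `pca` are assumptions for illustration.

```python
import numpy as np

def pca(X, n_components=2):
    """Return the projected data and the explained-variance ratios."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained = S**2 / np.sum(S**2)
    return X_centered @ components.T, explained[:n_components]

# Synthetic data: 200 samples with 10 features that mostly vary
# along a hidden 2-D plane, plus a little noise.
rng = np.random.default_rng(0)
latent = rng.standard_normal((200, 2))
mixing = rng.standard_normal((2, 10))
X = latent @ mixing + 0.01 * rng.standard_normal((200, 10))

X_2d, explained = pca(X, n_components=2)
print(X_2d.shape, float(explained.sum()))
```

The two columns of `X_2d` can be drawn as an ordinary scatter plot, which is exactly the kind of picture that the original 10 features do not allow.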
A Multi-view Dimensionality Reduction Algorithm Based on Smooth Representation Model
Over the past few decades, we have witnessed a large family of algorithms designed to provide different solutions to the problem of dimensionality reduction (DR). DR is an essential tool for extracting the important information from high-dimensional data by mapping the data to a low-dimensional subspace. Furthermore, given the diversity of high-dimensional data, multi-view features can be utilized to improve the learning performance. However, many DR methods fail to integrate multiple views. Although the features from different views are extracted in different manners, they are used to describe the same sample, which implies that they are highly related. Therefore, how to learn the subspace for high-dimensional features by utilizing the consistency and complementary properties of multi-view features is an important problem. In this paper, we propose an effective multi-view dimensionality reduction algorithm named Multi-view Smooth Preserve Projection. Firstly, we construct a single-view DR method named Smooth Preserve Projection based on the Smooth Representation model. The proposed method aims to find a subspace for the high-dimensional data in which the smooth reconstructive weights are preserved as much as possible. Then, we extend it to a multi-view version in which we exploit the Hilbert-Schmidt Independence Criterion to jointly learn one common subspace for all views. Extensive experiments on multi-view datasets show the excellent performance of the proposed method.
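For readers unfamiliar with the Hilbert-Schmidt Independence Criterion (HSIC), the sketch below computes the standard biased empirical estimate, tr(KHLH)/(n-1)^2, between two views of the same samples. It only illustrates the dependence measure itself, not how the paper uses it to learn a common subspace. The Gaussian kernel, the bandwidth `sigma = 1`, and the synthetic views are assumptions.

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC between paired samples X and Y."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    K = gaussian_kernel(X, sigma)
    L = gaussian_kernel(Y, sigma)
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2

rng = np.random.default_rng(0)
view_a = rng.standard_normal((200, 1))
view_b = view_a + 0.1 * rng.standard_normal((200, 1))   # depends on view_a
view_c = rng.standard_normal((200, 1))                  # independent of view_a

hsic_dependent = hsic(view_a, view_b)
hsic_independent = hsic(view_a, view_c)
print(hsic_dependent, hsic_independent)
```

A larger value indicates stronger statistical dependence between the two views, which is why the criterion is a natural fit for encouraging agreement across views.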
TriMap: Large-scale Dimensionality Reduction Using Triplets
Amid, Ehsan, Warmuth, Manfred K.
B More Visualizations. We compare the results of TriMap to LargeVis in Figures 7 and 8. We also provide more visualizations obtained using TriMap in Figure 9. C Discussion. We briefly discuss the results of TriMap and draw a comparison to the other methods. TriMap generally provides better global accuracy compared to the competing methods. It also successfully maintains the continuity of the underlying manifold. This can be seen from the COIL-20 result where certain clusters are located farther away from the remaining clusters. However, the underlying structure for the main cluster resembles the one provided by the other methods. TriMap also preserves the continuous structure in the Fashion MNIST and the TV News datasets. TriMap is also efficient in uncovering the possible outliers in the data. For instance, PCA reveals a large number of outliers in the Tabula Muris and the 360K Lyrics datasets.
Interpretable Discriminative Dimensionality Reduction and Feature Selection on the Manifold
Hosseini, Babak, Hammer, Barbara
Dimensionality reduction (DR) on the manifold includes effective methods which project the data from an implicit relational space onto a vectorial space. Despite the achievements in this area, these algorithms suffer from a lack of interpretability of the projection dimensions. Therefore, it is often difficult to explain the physical meaning behind the embedding dimensions. In this research, we propose the interpretable kernel DR algorithm (I-KDR), a new algorithm which maps the data from the feature space to a lower-dimensional space where the classes are more condensed and overlap less. Besides, the algorithm builds the dimensions from local contributions of the data samples, which makes it easier to interpret them by class labels. Additionally, we efficiently fuse the DR with the feature selection task to select the features of the original space that are most relevant to the discriminative objective. Based on the empirical evidence, I-KDR provides better interpretations for embedding dimensions as well as higher discriminative performance in the embedded space compared to state-of-the-art and popular DR algorithms.
Laplacian Matrix for Dimensionality Reduction and Clustering
Wiskott, Laurenz, Schönfeld, Fabian
Many problems in machine learning can be expressed by means of a graph with nodes representing training samples and edges representing the relationship between samples in terms of similarity, temporal proximity, or label information. Graphs can in turn be represented by matrices. A special example is the Laplacian matrix, which allows us to assign each node a value that varies only little between strongly connected nodes and more between distant nodes. Such an assignment can be used to extract a useful feature representation, find a good embedding of data in a low dimensional space, or perform clustering on the original samples. In these lecture notes we first introduce the Laplacian matrix and then present a small number of algorithms designed around it.
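As a small illustration of the idea in these notes, the sketch below builds the unnormalized Laplacian L = D - A of a toy graph and uses the eigenvector of its second-smallest eigenvalue (the Fiedler vector) to assign one value per node. The graph, two triangles joined by a single edge, and the variable names are assumptions made for the example.

```python
import numpy as np

# Two triangles {0, 1, 2} and {3, 4, 5} joined by the bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

D = np.diag(A.sum(axis=1))    # degree matrix
L = D - A                     # unnormalized graph Laplacian

# For any x, x @ L @ x equals the sum of (x[u] - x[v])**2 over all edges,
# so eigenvectors with small eigenvalues change little along edges.
eigenvalues, eigenvectors = np.linalg.eigh(L)   # eigenvalues in ascending order
fiedler = eigenvectors[:, 1]

groups = (fiedler > 0).astype(int).tolist()
print(np.round(eigenvalues, 3), groups)
```

The smallest eigenvalue is zero with a constant eigenvector, so it carries no information. The sign pattern of the Fiedler vector splits the nodes into the two triangles, and its entries can also serve as a one-dimensional embedding of the graph.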
Comprehensive Guide to 12 Dimensionality Reduction Techniques
Have you ever worked on a dataset with more than a thousand features? I have, and let me tell you it's a very challenging task, especially if you don't know where to start! Having a high number of variables is both a boon and a curse. It's great that we have loads of data for analysis, but it is challenging due to size. It's not feasible to analyze each and every variable at a microscopic level. It might take us days or months to perform any meaningful analysis and we'll lose a ton of time and money for our business! Not to mention the amount of computational power this will take. We need a better way to deal with high dimensional data so that we can quickly extract patterns and insights from it. So how do we approach such a dataset?