Supervised Dimensionality Reduction and Visualization using Centroid-encoder
Ghosh, Tomojit; Kirby, Michael
Visualizing high-dimensional data is an essential task in data science and machine learning. The Centroid-Encoder (CE) method is similar to the autoencoder but incorporates label information to keep objects of a class close together in the reduced visualization space. CE exploits nonlinearity and labels to encode high variance in low dimensions while capturing the global structure of the data. We present a detailed analysis of the method using a wide variety of data sets and compare it with other supervised dimension reduction techniques, including NCA, nonlinear NCA, t-distributed NCA, t-distributed MCML, supervised UMAP, supervised PCA, Colored Maximum Variance Unfolding, supervised Isomap, Parametric Embedding, supervised Neighbor Retrieval Visualizer, and Multiple Relational Embedding. We show empirically that CE outperforms most of these techniques. We also show that when the data variance is spread across multiple modalities, CE extracts a significant amount of information from the data in a low-dimensional space. This key feature establishes its value as a tool for data visualization.
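The idea described in the abstract can be sketched as follows, assuming (as the method's name suggests) that each sample is trained to map to the centroid of its class through a low-dimensional bottleneck, rather than to itself as in an autoencoder. The linear encoder and decoder, the function names, the hyperparameters, and the toy data are illustrative assumptions, not the authors' implementation, which uses a nonlinear network.

```python
import numpy as np

def centroid_targets(X, y):
    """Replace every sample by the mean of its class (hypothetical helper)."""
    centroids = {c: X[y == c].mean(axis=0) for c in np.unique(y)}
    return np.stack([centroids[c] for c in y])

def train_linear_ce(X, y, dim=2, lr=0.01, epochs=500, seed=0):
    """Fit a linear bottleneck map x -> (x @ W_enc) @ W_dec toward class centroids.

    This is a simplified linear stand-in for a nonlinear centroid-encoder.
    Returns the encoder, the decoder, and the loss recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, dim))
    W_dec = rng.normal(scale=0.1, size=(dim, d))
    T = centroid_targets(X, y)          # targets are centroids, not the inputs
    losses = []
    for _ in range(epochs):
        Z = X @ W_enc                   # low-dimensional embedding
        err = Z @ W_dec - T             # output minus class centroid
        losses.append(0.5 * np.mean(np.sum(err ** 2, axis=1)))
        # Gradients of the mean squared error with respect to both matrices.
        grad_dec = Z.T @ err / n
        grad_enc = X.T @ (err @ W_dec.T) / n
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses

# Toy usage: two labeled Gaussian clouds in 5 dimensions, embedded in 2.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(size=(50, 5)), rng.normal(size=(50, 5)) + 3.0])
y = np.array([0] * 50 + [1] * 50)
W_enc, W_dec, losses = train_linear_ce(X, y)
embedding = X @ W_enc                   # 2-D coordinates for plotting by label
```

Swapping the target from the input to the class centroid is the only change relative to a plain autoencoder in this sketch; the two-dimensional bottleneck output is what would be plotted.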
Feb-28-2020
- Country:
  - North America
    - United States
      - New York (0.04)
      - Rhode Island > Providence County
        - Providence (0.04)
      - Massachusetts > Middlesex County
        - Cambridge (0.04)
      - Colorado > Larimer County
        - Fort Collins (0.04)
    - Canada > Quebec
      - Montreal (0.04)
  - Europe > Czechia
    - Prague (0.04)
- Genre:
  - Research Report (0.64)
- Technology: