CA-PCA: Manifold Dimension Estimation, Adapted for Curvature

Gilbert, Anna C., O'Neill, Kevin

arXiv.org Machine Learning 

Much of modern data analysis in high dimensions relies on the premise that data, while embedded in a high-dimensional space, lie on or near a submanifold of lower dimension. This allows one to embed the data in a space of lower dimension while preserving much of the essential structure, with benefits including faster computation and data visualization. This lower dimension, hereafter referred to as the intrinsic dimension (ID) of the underlying manifold, often enters as a parameter of the dimension-reduction scheme. For instance, in each of the Johnson-Lindenstrauss-type results for manifolds by [13] and [4], the target dimension depends on the ID. Furthermore, the ID is a parameter of popular dimension-reduction methods such as t-SNE [28] and multidimensional scaling [12, 16]. It may therefore be beneficial to estimate the ID before running further analysis: compressing the data too much may destroy underlying structure, and re-running algorithms with a new dimension parameter may be computationally expensive, if such an error is even detectable.
