A Topological "Reading" Lesson: Classification of MNIST using TDA
Garin, Adélie; Tauzin, Guillaume
We present a way to use Topological Data Analysis (TDA) for machine learning tasks on grayscale images. We apply persistent homology to generate a wide range of topological features using a point cloud obtained from an image, its natural grayscale filtration, and different filtrations defined on the binarized image. We show that this topological machine learning pipeline can be used as a highly relevant dimensionality reduction by applying it to the MNIST digits dataset. We perform feature selection and study the correlations between the selected features while providing an intuitive interpretation of their importance, which is relevant in both machine learning and TDA. Finally, we show that we can classify digit images while reducing the size of the feature set by a factor of 5 compared to the grayscale pixel value features and maintaining similar accuracy.

INTRODUCTION

Topological Data Analysis (TDA) [1] applies techniques from algebraic topology to study and extract topological and geometric information on the shape of data. In this paper, we use persistent homology [2], a tool from TDA that extracts features representing the numbers of connected components, cycles, and voids, together with their birth and death during an iterative process called a filtration. Each of those features is summarized as a point in a persistence diagram.
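To make the notions of birth, death, and filtration concrete, the following is a minimal sketch, not the pipeline used in the paper. It covers only connected components (dimension 0) for the sublevel filtration of a grayscale image, in which pixels are added in order of increasing intensity. The function name `sublevel_h0_persistence`, the 4-neighbour pixel adjacency, and the toy image are all assumptions made for illustration.

```python
import math


def sublevel_h0_persistence(image):
    """Return sorted (birth, death) pairs of connected components.

    Pixels enter the filtration in order of increasing grayscale value.
    Assumption: pixels are connected through their 4 neighbours.
    """
    rows, cols = len(image), len(image[0])
    parent = {}  # union-find forest over the pixels added so far
    birth = {}   # birth value of each component, stored at its root

    def find(p):
        # Find the root of p, with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    pairs = []
    pixels = sorted(
        (image[r][c], r, c) for r in range(rows) for c in range(cols)
    )
    for value, r, c in pixels:
        p = (r, c)
        parent[p] = p
        birth[p] = value
        for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if q not in parent:
                continue  # neighbour is outside the grid or not yet added
            a, b = find(p), find(q)
            if a == b:
                continue
            # Elder rule: the component born later dies at the merge.
            if birth[a] < birth[b]:
                a, b = b, a
            if birth[a] < value:  # skip pairs with zero persistence
                pairs.append((birth[a], value))
            parent[a] = b

    # Components that never merge persist forever.
    for root in {find(p) for p in parent}:
        pairs.append((birth[root], math.inf))
    return sorted(pairs)


# Toy image: four dark corners separated by brighter pixels of value 5.
toy_image = [
    [0, 5, 1],
    [5, 5, 5],
    [2, 5, 3],
]
diagram = sublevel_h0_persistence(toy_image)
print(diagram)  # [(0, inf), (1, 5), (2, 5), (3, 5)]
```

Each tuple is one point of the dimension-0 persistence diagram: the four dark corners are born at their own intensities, three of them die when the bright pixels join them at value 5, and the oldest component never dies. Cycles and voids require a higher-dimensional computation that this sketch does not attempt.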
Oct-22-2019