A Primer on Topological Data Analysis to Support Image Analysis Tasks in Environmental Science
Topological data analysis (TDA) is a tool from data science and mathematics that is beginning to make waves in environmental science. In this work, we seek to provide an intuitive and understandable introduction to a tool from TDA that is particularly useful for the analysis of imagery, namely persistent homology. We briefly discuss the theoretical background but focus primarily on understanding the output of this tool and discussing what information it can glean. To this end, we frame our discussion around a guiding example of classifying satellite images from the Sugar, Fish, Flower, and Gravel Dataset produced for the study of mesocale organization of clouds by Rasp et. We demonstrate how persistent homology and its vectorization, persistence landscapes, can be used in a workflow with a simple machine learning algorithm to obtain good results, and explore in detail how we can explain this behavior in terms of image-level features.
Jul-25-2022, 20:14:51 GMT