How to Program UMAP from Scratch


And how to improve UMAP

This is the thirteenth article of my column Mathematical Statistics and Machine Learning for Life Sciences, where I try to explain, in a simple way, some mysterious analytical techniques used in Bioinformatics, Biomedicine, Genetics, etc. In the previous post, How Exactly UMAP works, I started with an intuitive explanation of the math behind UMAP. The best way to learn it is to program UMAP from scratch, and this is what we are going to do today. The idea of this post is to show that it is relatively easy for anyone to create their own neighbor-graph dimension reduction technique that can provide even better visualization than UMAP. There is going to be lots of coding, so buckle up!

