CyberPoint · Blog · Using Compression to Compare Objects
In my previous blog post, I discussed our effort to apply unsupervised learning to CyberPoint's malware dataset. One of the more intriguing tools I experimented with during that effort was the normalized compression distance (NCD), which uses compression to measure how similar two objects are. It does so by approximating the normalized Kolmogorov distance. The Kolmogorov distance between two objects is easy to conceptualize: it is the length of the shortest program that can transform one object into the other. Unlike many popular similarity measures, this provides a universal notion of similarity, because it quantifies the difference between two objects without restricting the type of difference.
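To make the idea concrete, here is a minimal sketch of NCD in Python. The shortest transforming program cannot be computed in general, so NCD substitutes the output size of a real compressor, using the standard formula NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length. The choice of zlib and the helper names `compressed_size` and `ncd` are assumptions made for illustration, not necessarily what we used on the malware dataset.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of `data` after zlib compression (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two non-empty byte strings.

    Values near 0 suggest the objects share most of their structure;
    values near 1 suggest they share very little.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


# Illustrative inputs only: two near-identical texts and one unrelated text.
doc_a = b"the quick brown fox jumps over the lazy dog " * 20
doc_b = doc_a.replace(b"lazy", b"sleepy", 3)
doc_c = bytes(range(256)) * 4

print("a vs b:", ncd(doc_a, doc_b))
print("a vs c:", ncd(doc_a, doc_c))
```

With a real compressor the result is only an approximation: values can land slightly above 1, and `ncd(x, x)` is usually a little greater than 0, because the compressor adds overhead and has a limited window.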
Dec-2-2019, 12:08:29 GMT