Practical Applications of Locality Sensitive Hashing for Unstructured Data

@machinelearnbot 

The purpose of this article is to demonstrate how a practicing data scientist can implement a Locality Sensitive Hashing (LSH) system from start to finish in order to drastically reduce the search time typically required to find similar items in high-dimensional spaces. LSH achieves this efficiency by sharply reducing the amount of data that must be stored when collecting features for comparison between item sets. In other words, LSH reduces a high-dimensional feature space while retaining a randomly permuted sample of the relevant features, which research has shown can be compared across data sets to obtain an accurate approximation of Jaccard similarity [2,3]. The concept has been around for some time, with publications dating back as far as 1999 [1] exploring its use for breaking the curse of dimensionality in nearest neighbor query problems. Since then, applications of Locality Sensitive Hashing have appeared in academic publications all over the world.
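To make the idea of approximating Jaccard similarity from a reduced set of features concrete, here is a minimal sketch. It assumes the MinHash variant of Locality Sensitive Hashing, which is the scheme usually paired with Jaccard similarity; the article's own implementation may differ. The function names, the choice of 256 hash functions, the use of `zlib.crc32` as a stable feature hash, and the two sample sentences are all illustrative assumptions rather than anything taken from the original text.

```python
import random
import zlib

# Assumption: a Mersenne prime larger than any 32-bit CRC value, used as the
# modulus for the random linear hash functions that stand in for permutations.
PRIME = (1 << 61) - 1


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions of the form (a * x + b) % PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(features, params):
    """Reduce a non-empty set of string features to a fixed-length signature."""
    # crc32 gives the same integer for a string on every run, unlike hash().
    hashed = [zlib.crc32(f.encode("utf-8")) for f in features]
    return [min((a * x + b) % PRIME for x in hashed) for a, b in params]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; approximates Jaccard."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical example documents, tokenized into sets of words.
doc_a = set("the quick brown fox jumps over the lazy dog".split())
doc_b = set("the quick brown fox leaps over a sleepy dog".split())

params = make_hash_params(256)
sig_a = minhash_signature(doc_a, params)
sig_b = minhash_signature(doc_b, params)

print("exact Jaccard:    ", exact_jaccard(doc_a, doc_b))
print("estimated Jaccard:", estimate_jaccard(sig_a, sig_b))
```

Each document is stored as 256 integers regardless of how many features it contains, and two signatures can be compared without revisiting the original feature sets. Using more hash functions tightens the estimate at the cost of a longer signature.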
