Comparison Between Global vs. Local Normalization of Tweets, and Various Distances
The text mining literature suggests that practitioners tend to use Cosine Distance to compare two documents, and that they have done so with great success. In our previous blog we also used Cosine Distance and found that it worked extremely well, helping us and our clustering method gain insight into the UK Exit Referendum. Here, we decided to change our initial conditions and see whether we get different outcomes, i.e. we decided to try four other distances: Jaccard, Matching, Rogers-Tanimoto and Euclidean.
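To make the five distances concrete, here is a minimal sketch in plain Python. It is not the code behind this post. It assumes each tweet has already been turned into a binary term-presence vector (1 if a vocabulary word occurs in the tweet, 0 otherwise), and it uses the common textbook definitions of each distance. The function names and the two toy vectors are hypothetical, chosen only for illustration.

```python
import math

def _counts(u, v):
    """Count agreements and disagreements between two binary vectors."""
    tt = sum(1 for a, b in zip(u, v) if a and b)          # word in both tweets
    tf = sum(1 for a, b in zip(u, v) if a and not b)      # word only in u
    ft = sum(1 for a, b in zip(u, v) if not a and b)      # word only in v
    ff = sum(1 for a, b in zip(u, v) if not a and not b)  # word in neither
    return tt, tf, ft, ff

def cosine_distance(u, v):
    # 1 - cos(angle); assumes neither vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def jaccard_distance(u, v):
    # Disagreements over positions where at least one vector is 1.
    tt, tf, ft, _ = _counts(u, v)
    return (tf + ft) / (tt + tf + ft)

def matching_distance(u, v):
    # Simple matching dissimilarity: share of positions that disagree.
    _, tf, ft, _ = _counts(u, v)
    return (tf + ft) / len(u)

def rogers_tanimoto_distance(u, v):
    # Like matching, but disagreements are weighted twice.
    tt, tf, ft, ff = _counts(u, v)
    r = 2 * (tf + ft)
    return r / (tt + ff + r)

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Two toy tweets over a hypothetical five-word vocabulary.
tweet_a = [1, 1, 0, 1, 0]
tweet_b = [1, 0, 0, 1, 1]

for name, fn in [
    ("cosine", cosine_distance),
    ("jaccard", jaccard_distance),
    ("matching", matching_distance),
    ("rogers-tanimoto", rogers_tanimoto_distance),
    ("euclidean", euclidean_distance),
]:
    print(f"{name:16s} {fn(tweet_a, tweet_b):.4f}")
```

Jaccard ignores words that are absent from both tweets, while Matching and Rogers-Tanimoto count those shared absences as agreement. With short tweets and a large vocabulary, most positions are shared absences, which is one reason the choice of distance can change the clustering outcome.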
Dec-18-2016, 02:05:04 GMT