k-nearest neighbor algorithm using Python
The example used to illustrate the method in the source code is the famous iris data set, which consists of 3 classes, 150 observations, and 4 variables, and was first analysed in 1936. How does the methodology perform on large data sets with many variables, or on unstructured data? Why was Python chosen for this analysis? I think this is great, but I would be interested to know the motivation. The author mentioned other classification techniques, such as SVM, Naive Bayes (which comes from statistical science), and neural networks.
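For readers who have not seen the method, here is a minimal sketch of k-nearest neighbor classification in plain Python. It is not the author's source code. The function name `knn_predict`, the choice of `k=3`, and the Euclidean distance are assumptions made for illustration, and only nine rows of the iris data (three per species) are used.

```python
import math
from collections import Counter

# Nine rows taken from the iris data set (three per species).
# Features: sepal length, sepal width, petal length, petal width (cm).
TRAIN = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
    ((7.1, 3.0, 5.9, 2.1), "virginica"),
]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows closest to query.

    Hypothetical helper for illustration: brute-force search with
    Euclidean distance (math.dist requires Python 3.8+).
    """
    # Sort every training row by its distance to the query, keep the k closest.
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    # Majority vote over the labels of those k neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# The query points below are made up for the example, not measured flowers.
print(knn_predict(TRAIN, (5.0, 3.4, 1.5, 0.2)))
print(knn_predict(TRAIN, (6.5, 3.0, 5.8, 2.2)))
```

This brute-force version computes the distance from the query to every training row, so the cost of one prediction grows with both the number of observations and the number of variables. That is one reason the question about large, high-dimensional data sets matters.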
Nov-30-2016, 03:30:03 GMT