Balanced k-Nearest Neighbors
Cook, Brian (University of Texas at Arlington) | Huber, Manfred (University of Texas at Arlington)
Classic k-Nearest Neighbor (kNN) algorithms approximate a regression or classification function at a query point based on the k nearest training observations. In real-world datasets, however, the k neighbors are frequently not uniformly distributed around a given query point. This can result in a locally biased estimate and thus in degraded regression or classification results. This paper presents two new kNN algorithms that adjust the weights of the k nearest neighbors to achieve a more balanced distribution. Experiments on real-world datasets and on a range of synthetic training distributions and noise levels identify conditions under which the algorithms can improve accuracy with minimal increase in computation time.
May-15-2019
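
The imbalance described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it implements only classic unweighted kNN regression, plus one hypothetical imbalance measure (the distance from the query to the centroid of its k neighbors) chosen here purely for illustration. The function names `k_nearest`, `knn_regress`, and `neighbor_offset` are assumptions and do not come from the paper.

```python
import math


def k_nearest(train_x, query, k):
    """Return indices of the k training points closest to query (Euclidean)."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    return order[:k]


def knn_regress(train_x, train_y, query, k):
    """Classic kNN regression: unweighted mean of the k nearest targets."""
    idx = k_nearest(train_x, query, k)
    return sum(train_y[i] for i in idx) / len(idx)


def neighbor_offset(train_x, query, k):
    """Hypothetical imbalance measure (an assumption, not from the paper):
    distance from the query to the centroid of its k nearest neighbors.
    A value of zero means the neighbors are centered on the query."""
    idx = k_nearest(train_x, query, k)
    centroid = [sum(train_x[i][d] for i in idx) / len(idx)
                for d in range(len(query))]
    return math.dist(centroid, query)


# 1-D toy data with target y = x.
train_x = [(1.0,), (2.0,), (3.0,), (4.0,)]
train_y = [1.0, 2.0, 3.0, 4.0]

# Query at the edge of the data: both neighbors lie on one side of it.
edge_query = (0.0,)
edge_estimate = knn_regress(train_x, train_y, edge_query, k=2)
edge_offset = neighbor_offset(train_x, edge_query, k=2)

# Query in the interior: the neighbors sit symmetrically around it.
mid_query = (2.5,)
mid_estimate = knn_regress(train_x, train_y, mid_query, k=2)
mid_offset = neighbor_offset(train_x, mid_query, k=2)
```

In this toy setup the edge query has a nonzero neighbor offset and its estimate is pulled toward the side where the neighbors lie, while the interior query has zero offset and recovers the true value. Reweighting the neighbors to reduce this kind of one-sidedness is the general idea the abstract describes. The specific weighting schemes are given in the paper and are not reproduced here.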