An upper bound on prototype set size for condensed nearest neighbor
The nearest neighbor (NN) rule assigns to an unclassified point the class of a closest point from a set of prototypical points. The NN algorithm stores every training point as a prototypical point and classifies new points according to the NN rule. A nice property is that, for arbitrary class distributions, as the number of training points goes to infinity, the error of the rule produced by the NN algorithm converges to within twice the Bayes error [4]. Unfortunately, storing every training point as a prototypical point can be impractical for huge training sets in terms of both memory complexity and the time complexity of classifying according to the NN rule. As a result, many techniques exist for reducing the size of the set of prototypical points.
Sep-29-2013