A Learning Algorithm

Algorithm 1: Learning algorithm for Dr.k-NN
Input: S
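The body of Algorithm 1 is not reproduced here. As background, the following is a minimal, hypothetical sketch of a weighted k-NN decision rule of the kind Dr.k-NN builds on; the function name, the inverse-distance weighting, and the input format are illustrative assumptions, not the paper's algorithm.

```python
import math
from collections import defaultdict

def knn_predict(train, x, k=3):
    """Hypothetical weighted k-NN vote (illustrative; not the paper's Algorithm 1).

    train: list of (feature_vector, label) pairs; x: query feature vector.
    Each of the k nearest neighbors votes for its label, weighted here by
    inverse distance (an assumed weighting scheme).
    """
    nearest = sorted((math.dist(xi, x), yi) for xi, yi in train)[:k]
    votes = defaultdict(float)
    for d, y in nearest:
        votes[y] += 1.0 / (d + 1e-9)  # inverse-distance weight (assumption)
    return max(votes, key=votes.get)
```

With a toy one-dimensional training set `[((0.0,), 0), ((0.1,), 0), ((1.0,), 1), ((1.1,), 1)]`, a query near 0 is assigned label 0 and a query near 1 is assigned label 1.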

Neural Information Processing Systems 

B.1 Proof of Theorem 1

The proof of Theorem 1 is based on the following two lemmas. Moreover, when there is a tie (i.e., the set

Proof of Lemma 2. Recall that the Wasserstein metric of order 1 is defined as
\[
W(P_1, P_2) = \min_{\gamma \in \Gamma(P_1, P_2)} \int \|\omega_1 - \omega_2\| \, \gamma(d\omega_1, d\omega_2),
\]
where \(\Gamma(P_1, P_2)\) denotes the set of all couplings (joint distributions) whose marginals are \(P_1\) and \(P_2\).

Extension to the Non-Few-Training-Sample Setting

For the sake of completeness, we extend our algorithm to the non-few-training-sample setting. The entropy of a sample is defined as follows. As a simple example, for a Bernoulli random variable (which can represent, e.g., the outcome of flipping a coin with bias p), the entropy is H(p) = -p log p - (1 - p) log(1 - p). Now we use this entropy to define the "uncertainty" associated with each training point.

[Figure 6: the depth of the shaded area indicates the level of sample entropy.]

Figure 6 reveals that the most informative samples usually lie in between categories.
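The order-1 Wasserstein metric recalled in the proof of Lemma 2 can be checked numerically in one dimension, where (a standard fact, not specific to this paper) the optimal coupling between two equal-size empirical distributions matches sorted samples, so W1 is the mean absolute difference of the order statistics. The helper below is an illustrative sketch under that assumption.

```python
def wasserstein1_1d(xs, ys):
    """W1 between two equal-size 1-D empirical distributions.

    In one dimension, the optimal transport plan matches sorted samples,
    so W1 equals the mean absolute difference of the order statistics.
    """
    assert len(xs) == len(ys), "empirical distributions must have equal size"
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
```

Shifting every sample by a constant c shifts the distance by exactly |c|: for instance, `wasserstein1_1d([0, 1, 2], [1, 2, 3])` is 1.0, while reordering a sample leaves the distance at 0.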
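The Bernoulli entropy example and the entropy-based notion of sample uncertainty can be sketched as follows. Computing each sample's entropy from a vector of predicted class probabilities is one standard choice and an assumption on our part; the paper's exact definition is not reproduced above.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def bernoulli_entropy(p):
    """Entropy of a coin with bias p; maximal at p = 0.5, zero at p in {0, 1}."""
    return entropy([p, 1.0 - p])

def most_uncertain(pred_probs):
    """Index of the sample whose predicted class distribution has the
    highest entropy, i.e., the sample lying most 'in between' categories.

    pred_probs: list of per-sample class-probability vectors (assumed input).
    """
    return max(range(len(pred_probs)), key=lambda i: entropy(pred_probs[i]))
```

For example, among predicted distributions [0.9, 0.1], [0.5, 0.5], and [1.0, 0.0], the second sample is the most uncertain, matching the observation that high-entropy samples sit between categories.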