Goto

Collaborating Authors

 learning nearest neighbor graph


Learning Nearest Neighbor Graphs from Noisy Distance Samples

Neural Information Processing Systems

We consider the problem of learning the nearest neighbor graph of a dataset of n items. The metric is unknown, but we can query an oracle to obtain a noisy estimate of the distance between any pair of items. This framework applies to problem domains where one wants to learn people's preferences from responses commonly modeled as noisy distance judgments. In this paper, we propose an active algorithm to find the graph with high probability and analyze its query complexity. In contrast to existing work that forces Euclidean structure, our method is valid for general metrics, assuming only symmetry and the triangle inequality. Furthermore, we demonstrate efficiency of our method empirically and theoretically, needing only O(n\log(n)\Delta^{-2}) queries in favorable settings, where \Delta^{-2} accounts for the effect of noise. Using crowd-sourced data collected for a subset of the UT~Zappos50K dataset, we apply our algorithm to learn which shoes people believe are most similar and show that it beats both an active baseline and ordinal embedding.


Reviews: Learning Nearest Neighbor Graphs from Noisy Distance Samples

Neural Information Processing Systems

As a non theory person I was able to follow the motivation, the problem set-up and the main result. The intuition that not all queries Q(j,k) needs to be issued conditioned on the past queries due to the fact that we're in a metric space. The bounds looks fine, mostly it is a construction of the confidence in the estimation of Q(j,k), again, under the constraint that Q(j,k) is a metric space with metric-ish properties. As a piece of technical achievement this paper is just fine. But it can improve in the following sense of story-telling and organisation: 1) we're doing an optimal query problem, namely, querying a noisey oracle Q(j,k) to construct a nearest-neighbor graph G(x_i,x_j).


Reviews: Learning Nearest Neighbor Graphs from Noisy Distance Samples

Neural Information Processing Systems

This paper defines a new learning problem of learning 1-NN graph on a metric space consisting of n points using a noisy distance oracle. Ignoring the metric structure, the problem can be viewed n separate best arm identification problems (for each node find its nearest neighbor by calling the oracle). The best arm identification problem can solved by UCB-like methods that maintain confidence intervals for each distance and has been studied extensively in the literature. The paper proposes an improved algorithm that exploits the triangle inequality and symmetry of the underlying metric in order to improve the confidence intervals. As the paper shows that the improvement doesn't help much for the worst-case metric space.


Learning Nearest Neighbor Graphs from Noisy Distance Samples

Neural Information Processing Systems

We consider the problem of learning the nearest neighbor graph of a dataset of n items. The metric is unknown, but we can query an oracle to obtain a noisy estimate of the distance between any pair of items. This framework applies to problem domains where one wants to learn people's preferences from responses commonly modeled as noisy distance judgments. In this paper, we propose an active algorithm to find the graph with high probability and analyze its query complexity. In contrast to existing work that forces Euclidean structure, our method is valid for general metrics, assuming only symmetry and the triangle inequality.


Learning Nearest Neighbor Graphs from Noisy Distance Samples

Neural Information Processing Systems

We consider the problem of learning the nearest neighbor graph of a dataset of n items. The metric is unknown, but we can query an oracle to obtain a noisy estimate of the distance between any pair of items. This framework applies to problem domains where one wants to learn people's preferences from responses commonly modeled as noisy distance judgments. In this paper, we propose an active algorithm to find the graph with high probability and analyze its query complexity. In contrast to existing work that forces Euclidean structure, our method is valid for general metrics, assuming only symmetry and the triangle inequality.