Randomized Algorithms for Comparison-based Search
Tschopp, Dominique, Diggavi, Suhas, Delgosha, Payam, Mohajer, Soheil
–Neural Information Processing Systems
This paper addresses the problem of finding the nearest neighbor (or one of the $R$-nearest neighbors) of a query object $q$ in a database of $n$ objects, when we can only use a comparison oracle. The comparison oracle, given two reference objects and a query object, returns the reference object most similar to the query object. The main problem we study is how to search the database for the nearest neighbor (NN) of a query, while minimizing the questions. The difficulty of this problem depends on properties of the underlying database. We show the importance of a characterization: \emph{combinatorial disorder} $D$ which defines approximate triangle inequalities on ranks. We present a lower bound of $\Omega(D\log \frac{n}{D}+D^2)$ average number of questions in the search phase for any randomized algorithm, which demonstrates the fundamental role of $D$ for worst case behavior. We develop a randomized scheme for NN retrieval in $O(D^3\log^2 n+ D\log^2 n \log\log n^{D^3})$ questions. The learning requires asking $O(n D^3\log^2 n+ D \log^2 n \log\log n^{D^3})$ questions and $O(n\log^2n/\log(2D))$ bits to store.
Neural Information Processing Systems
Dec-31-2011
- Country:
- Europe > Switzerland (0.28)
- North America > United States
- California > Los Angeles County > Los Angeles (0.14)
- Technology: