Walking on Minimax Paths for k-NN Search

Kim, Kye-Hyeon (Pohang University of Science and Technology) | Choi, Seungjin (Pohang University of Science and Technology)

AAAI Conferences 

Link-based dissimilarity measures, such as shortest path or Euclidean commute time distance, base their distance on paths between nodes of a weighted graph. These measures are known to be better suited to data manifold with nonconvex-shaped clusters, compared to Euclidean distance, so that k -nearest neighbor (NN) search is improved in such metric spaces. In this paper we present a new link-based dissimilarity measure based on minimax paths between nodes. Two main benefits of minimax path-based dissimilarity measure are: (1) only a subset of paths is considered to make it scalable, while Euclidean commute time distance considers all possible paths; (2) it better captures nonconvex-shaped cluster structure, compared to shortest path distance. We define the total cost assigned to a path between nodes as L p norm of intermediate costs of edges involving the path, showing that minimax path emerges from our L p norm over paths framework. We also define minimax distance as the intermediate cost of the longest edge on the minimax path, then present a greedy algorithm to compute k smallest minimax distances between a query and N data points in O(log N + k log k) time. Numerical experiments demonstrate that our minimax k-NN algorithm reduce the search time by several orders of magnitude, compared to existing methods, while the quality of k -NN search is significantly improved over Euclidean distance.