Rand-NSG: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node
Subramanya, Suhas Jayaram, Devvrit, Fnu, Simhadri, Harsha Vardhan, Krishnawamy, Ravishankar, Kadekodi, Rohan
–Neural Information Processing Systems
Current state-of-the-art approximate nearest neighbor search (ANNS) algorithms generate indices that must be stored in main memory for fast high-recall search. This makes them expensive and limits the size of the dataset. We present a new graph-based indexing and search system called DiskANN that can index, store, and search a billion point database on a single workstation with just 64GB RAM and an inexpensive solid-state drive (SSD). Contrary to current wisdom, we demonstrate that the SSD-based indices built by DiskANN can meet all three desiderata for large-scale ANNS: high-recall, low query latency and high density (points indexed per node). On the billion point SIFT1B bigann dataset, DiskANN serves 5000 queries a second with 3ms mean latency and 95% 1-recall@1 on a 16 core machine, where state-of-the-art billion-point ANNS algorithms with similar memory footprint like FAISS and IVFOADC G P plateau at around 50% 1-recall@1.
Neural Information Processing Systems
Mar-19-2020, 02:17:00 GMT
- Technology: