AITopics | diskann

Collaborating Authors

diskann

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Worst-case Performance of Popular Approximate Nearest Neighbor Search Implementations: Guarantees and Limitations

Neural Information Processing SystemsApr-29-2026, 20:34:58 GMT

Graph-based approaches to nearest neighbor search are popular and powerful tools for handling large datasets in practice, but they have limited theoretical guarantees. We study the worst-case performance of recent graph-based approximate nearest neighbor search algorithms, such as HNSW, NSG and DiskANN. For DiskANN, we show that its "slow preprocessing" version provably supports approximate nearest neighbor search query with constant approximation ratio and poly-logarithmic query time, on data sets with bounded "intrinsic" dimension. For the other data structure variants studied, including DiskANN with "fast preprocessing", HNSW and NSG, we present a family of instances on which the empirical query time required to achieve a "reasonable" accuracy is linear in instance size. For example, for DiskANN, we show that the query procedure can take at least 0.1n steps on instances of size nbefore it encounters any of the 5nearest neighbors of the query.

artificial intelligence, information retrieval, natural language, (18 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Add feedback

Rand-NSG: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnawamy, Rohan Kadekodi

Neural Information Processing SystemsFeb-11-2026, 10:06:17 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, dataset, graph, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > India (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.65)

Add feedback

Worst-case Performance of Popular Approximate Nearest Neighbor Search Implementations: Guarantees and Limitations

Neural Information Processing SystemsDec-26-2025, 20:16:13 GMT

Graph-based approaches to nearest neighbor search are popular and powerful tools for handling large datasets in practice, but they have limited theoretical guarantees. We study the worst-case performance of recent graph-based approximate nearest neighbor search algorithms, such as HNSW, NSG and DiskANN. For DiskANN, we show that its slow preprocessing'' version provably supports approximate nearest neighbor search query with constant approximation ratio and poly-logarithmic query time, on data sets with bounded intrinsic'' dimension. For the other data structure variants studied, including DiskANN with fast preprocessing'', HNSW and NSG, we present a family of instances on which the empirical query time required to achieve a reasonable'' accuracy is linear in instance size. For example, for DiskANN, we show that the query procedure can take at least $0.1 n$ steps on instances of size $n$ before it encounters any of the $5$ nearest neighbors of the query.

approximate nearest neighbor search implementation, name change, worst-case performance, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

Neural Information Processing SystemsDec-24-2025, 23:58:14 GMT

Current state-of-the-art approximate nearest neighbor search (ANNS) algorithms generate indices that must be stored in main memory for fast high-recall search. This makes them expensive and limits the size of the dataset. We present a new graph-based indexing and search system called DiskANN that can index, store, and search a billion point database on a single workstation with just 64GB RAM and an inexpensive solid-state drive (SSD). Contrary to current wisdom, we demonstrate that the SSD-based indices built by DiskANN can meet all three desiderata for large-scale ANNS: high-recall, low query latency and high density (points indexed per node). On the billion point SIFT1B bigann dataset, DiskANN serves > 5000 queries a second with < 3ms mean latency and 95%+ 1-recall@1 on a 16 core machine, where state-of-the-art billion-point ANNS algorithms with similar memory footprint like FAISS and IVFOADC+G+P plateau at around 50% 1-recall@1. Alternately, in the high recall regime, DiskANN can index and serve 5 10x more points per node compared to state-of-the-art graph-based methods such as HNSW and NSG. Finally, as part of our overall DiskANN system, we introduce Vamana, a new graph-based ANNS index that is more versatile than the graph indices even for in-memory indices.

accurate billion-point nearest neighbor search, diskann, name change, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.65)

Add feedback

Rand-NSG: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

Suhas Jayaram Subramanya, Fnu Devvrit, Harsha Vardhan Simhadri, Ravishankar Krishnawamy, Rohan Kadekodi

Neural Information Processing SystemsOct-2-2025, 00:52:49 GMT

This makes them expensive and limits the size of the dataset.

information retrieval, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.65)

Add feedback

MINT: Multi-Vector Search Index Tuning

Zhu, Jiongli, Wang, Yue, Ding, Bailu, Bernstein, Philip A., Narasayya, Vivek, Chaudhuri, Surajit

arXiv.org Artificial IntelligenceApr-29-2025

Vector search plays a crucial role in many real-world applications. In addition to single-vector search, multi-vector search becomes important for multi-modal and multi-feature scenarios today. In a multi-vector database, each row is an item, each column represents a feature of items, and each cell is a high-dimensional vector. In multi-vector databases, the choice of indexes can have a significant impact on performance. Although index tuning for relational databases has been extensively studied, index tuning for multi-vector search remains unclear and challenging. In this paper, we define multi-vector search index tuning and propose a framework to solve it. Specifically, given a multi-vector search workload, we develop algorithms to find indexes that minimize latency and meet storage and recall constraints. Compared to the baseline, our latency achieves 2.1X to 8.3X speedup.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2504.20018

Country:

Asia (0.93)
North America > United States > California (0.67)
Europe (0.67)

Genre:

Research Report (0.50)
Overview (0.46)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback

Review for NeurIPS paper: HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory

Neural Information Processing SystemsFeb-11-2025, 22:47:07 GMT

The paper attempts to scale nearest neighbor search using heterogenous memory hardware. In this regard, authors devised a practical trick on top of HNSW. It is a clean node promotion strategy along the memory hierarchy using the degree information. The method was evaluated on some common large datasets, but not necessarily difficult ones. Reviewers found the setup to leverage the memory hierarchy interesting and the benefits obtained from it appears promising.

efficient billion-point nearest neighbor search, heterogeneous memory, memory hierarchy, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.66)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.66)

Add feedback

Reviews: DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

Neural Information Processing SystemsJan-21-2025, 10:31:37 GMT

The writing could be improved, but it's in general understandable. However, citation quality can be improved. In particular, it seems to me that NSG and HNSW are actually using the same pruning rule (which results in approximate relative neighborhood graph). I really like your updated version, which reduces the number hops (and I haven't seen this pruning variant before)! Detailed comments: Abstract and further: base points sounds like a strange term, do you mean domain points? Please, find a more specific-generic citation that describes this phenomena.

accurate billion-point nearest neighbor search, relative neighborhood graph, single node, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.53)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.44)

Add feedback

Reviews: DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

Neural Information Processing SystemsJan-21-2025, 10:31:26 GMT

In post rebuttal discussions, reviewers concurred in subsequent discussions that the paper presents solid state of art implementation and very impressive results, which will have good impact for practitioners. This significant impact by itself was worthy of publication.

accurate billion-point nearest neighbor search, diskann, single node

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

Add feedback

Filters

Collaborating Authors

diskann

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Worst-case Performance of Popular Approximate Nearest Neighbor Search Implementations: Guarantees and Limitations

d0ac28b79816b51124fcc804b2496a36-Paper-Conference.pdf

Rand-NSG: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

Worst-case Performance of Popular Approximate Nearest Neighbor Search Implementations: Guarantees and Limitations

DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

Rand-NSG: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

MINT: Multi-Vector Search Index Tuning

Review for NeurIPS paper: HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory

Reviews: DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

Reviews: DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node