Panorama: Fast-Track Nearest Neighbors

Ramani, Vansh, Schlomer, Alexis, Nayar, Akash, Ranu, Sayan, Patel, Jignesh M., Karras, Panagiotis

Oct-27-2025–arXiv.org Artificial Intelligence

Approximate Nearest-Neighbor Search (ANNS) efficiently finds data items whose embeddings are close to that of a given query in a high-dimensional space, aiming to balance accuracy with speed. Used in recommendation systems, image and video retrieval, natural language processing, and retrieval-augmented generation (RAG), ANNS algorithms such as IVFPQ, HNSW graphs, Annoy, and MRPT utilize graph, tree, clustering, and quantization techniques to navigate large vector spaces. Despite this progress, ANNS systems spend up to 99% of query time to compute distances in their final refinement phase. Such transforms compact over 90% of signal energy into the first half of dimensions, enabling early candidate pruning with partial distance computations. Experiments across diverse datasets--from image-based CIFAR-10 and GIST to modern embedding spaces including OpenAI's Ada 2 and Large 3--demonstrate that P The proliferation of large-scale neural embeddings has transformed machine learning applications, from computer vision and recommendation systems (Lowe, 2004; Koren et al., 2009) to bioinformat-ics (Altschul et al., 1990) and modern retrieval-augmented generation (RAG) systems (Lewis et al., 2020; Gao et al., 2023). As embedding models evolve from hundreds to thousands of dimensions-- exemplified by OpenAI's text-embedding-3-large (Neelakantan et al., 2022)--the demand for efficient and scalable real-time Approximate Nearest-Neighbor Search (ANNS) intensifies.Figure 1: Common ANNS operations on vector databases. Current ANNS methods fall into four major categories: graph-based, clustering-based, tree-based, and hash-based. Tree-based methods, including kd-trees (Bentley, 1975) and FLANN (Muja & Lowe, 2014), recursively divide the space but degrade in high dimensions due to the curse of dimensionality. Finally, hash-based methods, such as LSH (Indyk & Motwani, 1998; Andoni & Indyk, 2006) and multi-probe LSH (Lv et al., 2007), map points into buckets so that similar points are likely to collide. Despite this diversity, all such methods operate in two phases (Babenko & Lempitsky, 2016): filtering and refinement (or verification).

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

Oct-27-2025

arXiv.org PDF

Add feedback

Country:
- Asia (0.46)
- North America > United States
  - California (0.28)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.94)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks > Deep Learning
      - Generative AI (0.44)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found