AITopics | ann search

e8752f3e51f33a2e06daf044c40ce412-Paper-Conference.pdf

Neural Information Processing Systems | Feb-12-2026, 14:06:58 GMT

ann search, classifier, nearest neighbor, (12 more...)

Neural Information Processing Systems

Country:

Europe > Finland > Uusimaa > Helsinki (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.30)

Learning-Based Hashing for ANN Search: Foundations and Early Advances

Moran, Sean

arXiv.org Artificial Intelligence | Oct-7-2025

Approximate Nearest Neighbour (ANN) search is a fundamental problem in information retrieval, underpinning large-scale applications in computer vision, natural language processing, and cross-modal search. Hashing-based methods provide an efficient solution by mapping high-dimensional data into compact binary codes that enable fast similarity computations in Hamming space. Over the past two decades, a substantial body of work has explored learning to hash, where projection and quantisation functions are optimised from data rather than chosen at random. This article offers a foundational survey of early learning-based hashing methods, with an emphasis on the core ideas that shaped the field. We review supervised, unsupervised, and semi-supervised approaches, highlighting how projection functions are designed to generate meaningful embeddings and how quantisation strategies convert these embeddings into binary codes. We also examine extensions to multi-bit and multi-threshold models, as well as early advances in cross-modal retrieval. Rather than providing an exhaustive account of the most recent methods, our goal is to introduce the conceptual foundations of learning-based hashing for ANN search. By situating these early models in their historical context, we aim to equip readers with a structured understanding of the principles, trade-offs, and open challenges that continue to inform current research in this area.
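The projection-then-quantisation pipeline described in this abstract can be illustrated with a minimal sketch. It assumes random Gaussian hyperplanes as a stand-in for the learned projections the survey covers, and a single zero threshold per bit; the function names (`make_hyperplanes`, `hash_code`, `hamming`, `search`) are hypothetical and not taken from the paper.

```python
import random

def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random Gaussian hyperplane normals.
    Assumption: random projections stand in for learned ones."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(x, hyperplanes):
    """Project x onto each normal, then quantise each projection to one bit
    by thresholding at zero. The code is packed into a Python int."""
    code = 0
    for w in hyperplanes:
        projection = sum(wi * xi for wi, xi in zip(w, x))
        code = (code << 1) | (1 if projection >= 0.0 else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two integer-packed codes."""
    return bin(a ^ b).count("1")

def search(query, database, hyperplanes, k=1):
    """Rank database items by Hamming distance between their codes and the query code."""
    q = hash_code(query, hyperplanes)
    codes = [hash_code(x, hyperplanes) for x in database]
    order = sorted(range(len(database)), key=lambda i: hamming(q, codes[i]))
    return order[:k]
```

A learned method would replace `make_hyperplanes` with projections fitted to the data (and possibly several thresholds per projection), while the Hamming-space ranking step would stay the same.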

information retrieval, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.04127

Country:

Europe (1.00)
Asia (1.00)
North America > Canada > Ontario (0.28)
North America > United States > California (0.28)

Genre: Overview (1.00)

Industry:

Education > Educational Setting (0.47)
Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

A Multilabel Classification Framework for Approximate Nearest Neighbor Search

Neural Information Processing Systems | Aug-19-2025, 15:29:15 GMT

Approximate nearest neighbor (ANN) search is a fundamental algorithmic problem.

artificial intelligence, classifier, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Europe > Finland > Uusimaa > Helsinki (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.30)

Billion-scale Similarity Search Using a Hybrid Indexing Approach with Advanced Filtering

Emanuilov, Simeon, Dimov, Aleksandar

arXiv.org Artificial Intelligence | Jan-23-2025

Similarity search, the task of finding similar vectors, has become a fundamental operation in machine learning, with applications in recommendation engines, semantic search systems, and more [1-3]. As datasets grow to billions of entries, the challenge of performing efficient searches on high-dimensional vectors becomes increasingly complex [4]. This is further compounded by the well-known curse of dimensionality [5], which affects the performance and accuracy of search algorithms as the number of dimensions increases. Approximate Nearest Neighbor (ANN) algorithms, such as Inverted File Index (IVF) [6] and Hierarchical Navigable Small World (HNSW) [7], have been developed to address scalability and performance issues. IVF segments the search space into smaller areas, called Voronoi cells [8], while HNSW constructs a navigable graph structure for efficient search space traversal. Despite their advancements, these methods often struggle to support complex, multi-dimensional filtering efficiently. This is crucial in practical scenarios where additional criteria beyond vector similarity are required to refine search results [6]. Examples of such scenarios include e-commerce product search and semantic search with filtering and recommendation systems.
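To make the IVF idea concrete, here is a minimal sketch of an inverted file index with a metadata filter applied while scanning the probed cells. It is not the hybrid method of the paper: it assumes plain k-means for the Voronoi cells, squared Euclidean distance, and a caller-supplied predicate `keep`; all function and parameter names are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, n_cells, n_iter=10, seed=0):
    """Plain Lloyd iterations; the centroids define the Voronoi cells."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_cells)]
    for _ in range(n_iter):
        groups = [[] for _ in range(n_cells)]
        for v in vectors:
            nearest = min(range(n_cells), key=lambda c: sq_dist(v, centroids[c]))
            groups[nearest].append(v)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cell went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def build_ivf(vectors, n_cells, seed=0):
    """Assign every vector index to the inverted list of its nearest centroid."""
    centroids = kmeans(vectors, n_cells, seed=seed)
    lists = [[] for _ in range(n_cells)]
    for idx, v in enumerate(vectors):
        cell = min(range(n_cells), key=lambda c: sq_dist(v, centroids[c]))
        lists[cell].append(idx)
    return centroids, lists

def ivf_search(query, vectors, centroids, lists, k=5, nprobe=2, keep=None):
    """Scan only the nprobe cells closest to the query; `keep` is an optional
    predicate on the item index that filters candidates during the scan."""
    probe = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in lists[c] if keep is None or keep(i)]
    candidates.sort(key=lambda i: sq_dist(query, vectors[i]))
    return candidates[:k]
```

With a selective filter, the probed cells may contain fewer than `k` matching items, which is one simple way to see why filtering and ANN indexing interact poorly; raising `nprobe` trades speed for recall.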

artificial intelligence, machine learning, vector, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.2478/cait-2024-0035

2501.13442

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Bulgaria > Sofia City Province > Sofia (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.54)

ChatGraph: Chat with Your Graphs

Peng, Yun, Lin, Sen, Chen, Qian, Xu, Lyu, Ren, Xiaojun, Li, Yafei, Xu, Jianliang

arXiv.org Artificial Intelligence | Jan-23-2024

Graph analysis is fundamental in real-world applications. Traditional approaches rely on SPARQL-like languages or clicking-and-dragging interfaces to interact with graph data. However, these methods either require users to possess high programming skills or support only a limited range of graph analysis functionalities. To address these limitations, we propose a large language model (LLM)-based framework called ChatGraph. With ChatGraph, users can interact with graphs through natural language, making it easier to use and more flexible than traditional approaches. The core of ChatGraph lies in generating chains of graph analysis APIs based on an understanding of the text and graphs supplied in the user prompts. To achieve this, ChatGraph consists of three main modules: an API retrieval module that searches for relevant APIs, a graph-aware LLM module that enables the LLM to comprehend graphs, and an API chain-oriented finetuning module that guides the LLM in generating API chains.

api chain, chatgraph, graph, (13 more...)

arXiv.org Artificial Intelligence

2401.12672

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Henan Province > Zhengzhou (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

PEFA: Parameter-Free Adapters for Large-scale Embedding-based Retrieval Models

Chang, Wei-Cheng, Jiang, Jyun-Yu, Zhang, Jiong, Al-Darabsah, Mutasem, Teo, Choon Hui, Hsieh, Cho-Jui, Yu, Hsiang-Fu, Vishwanathan, S. V. N.

arXiv.org Artificial Intelligence | Dec-5-2023

Embedding-based Retrieval Models (ERMs) have emerged as a promising framework for large-scale text retrieval problems due to powerful large language models. Nevertheless, fine-tuning ERMs to reach state-of-the-art results can be expensive due to the extreme scale of data as well as the complexity of multi-stage pipelines (e.g., pre-training, fine-tuning, distillation). In this work, we propose the PEFA framework, namely ParamEter-Free Adapters, for fast tuning of ERMs without any backward pass in the optimization. At the index building stage, PEFA equips the ERM with a non-parametric k-nearest neighbor (kNN) component. At the inference stage, PEFA performs a convex combination of two scoring functions, one from the ERM and the other from the kNN. Based on the neighborhood definition, the PEFA framework induces two realizations, namely PEFA-XL (i.e., extra large) using double ANN indices and PEFA-XS (i.e., extra small) using a single ANN index. Empirically, PEFA achieves significant improvement on two retrieval applications. For document retrieval, PEFA improves Recall@100 not only for pre-trained ERMs on Trivia-QA, by an average of 13.2%, but also for fine-tuned ERMs on NQ-320K, by an average of 5.5%. For product search, PEFA improves the Recall@100 of the fine-tuned ERMs by an average of 5.3% and 14.5% for PEFA-XS and PEFA-XL, respectively. Our code is available at https://github.com/amzn/pecos/tree/mainline/examples/pefa-wsdm24.
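The convex combination of the two scoring functions can be sketched as follows. This is a toy illustration under stated assumptions, not the PEFA implementation: the ERM score is taken to be an inner product, the kNN component is assumed to let the nearest training queries vote for their relevant documents, and the names `erm_scores`, `knn_scores`, `combined_scores` and the weight `lam` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def erm_scores(query_emb, doc_embs):
    """Assumed ERM scoring: inner product between query and document embeddings."""
    return [dot(query_emb, d) for d in doc_embs]

def knn_scores(query_emb, train_query_embs, train_relevance, n_docs, k=2):
    """Toy kNN component (an assumption): each of the k nearest training queries
    votes for its relevant documents, weighted by its similarity to the query."""
    sims = [dot(query_emb, t) for t in train_query_embs]
    nearest = sorted(range(len(sims)), key=lambda i: -sims[i])[:k]
    scores = [0.0] * n_docs
    for i in nearest:
        for doc in train_relevance[i]:
            scores[doc] += sims[i]
    return scores

def combined_scores(erm, knn, lam):
    """Convex combination lam * erm + (1 - lam) * knn, with 0 <= lam <= 1."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * e + (1.0 - lam) * s for e, s in zip(erm, knn)]
```

Because neither component has trainable parameters in this sketch, the only thing to tune is `lam`, which can be chosen on held-out queries without any backward pass.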

erm, pefa-xl, proceedings, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3616855.3635791

2312.02429

Country:

North America > Mexico > Yucatán > Mérida (0.15)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
(8 more...)

Genre: Research Report (0.50)

Industry:

Information Technology > Services (0.34)
Retail > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.68)

kNN-Embed: Locally Smoothed Embedding Mixtures For Multi-interest Candidate Retrieval

El-Kishky, Ahmed, Markovich, Thomas, Leung, Kenny, Portman, Frank, Haghighi, Aria, Xiao, Ying

arXiv.org Artificial Intelligence | Aug-5-2023

Candidate retrieval is the first stage in recommendation systems, where a light-weight system is used to retrieve potentially relevant items for an input user. These candidate items are then ranked and pruned in later stages of recommender systems using a more complex ranking model. As the top of the recommendation funnel, it is important to retrieve a high-recall candidate set to feed into downstream ranking models. A common approach is to leverage approximate nearest neighbor (ANN) search from a single dense query embedding; however, this approach can yield a low-diversity result set with many near duplicates. As users often have multiple interests, candidate retrieval should ideally return a diverse set of candidates reflective of the user's multiple interests. To this end, we introduce kNN-Embed, a general approach to improving diversity in dense ANN-based retrieval. kNN-Embed represents each user as a smoothed mixture over learned item clusters that represent distinct "interests" of the user. By querying each of a user's mixture components in proportion to its mixture weight, we retrieve a high-diversity set of candidates reflecting elements from each of a user's interests. We experimentally compare kNN-Embed to standard ANN candidate retrieval, and show significant improvements in overall recall and improved diversity across three datasets. Accompanying this work, we open source a large Twitter follow-graph dataset (https://huggingface.co/datasets/Twitter/TwitterFollowGraph), to spur further research in graph-mining and representation learning for recommender systems.
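Querying each mixture component in proportion to its weight can be sketched roughly as below. The sketch assumes a largest-remainder rule for splitting the candidate budget, one query embedding per interest cluster, and brute-force inner-product ranking in place of an ANN index; `allocate` and `retrieve` are hypothetical names, not functions from the paper.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def allocate(weights, k):
    """Split k retrieval slots across mixture components in proportion to
    their weights, using largest-remainder rounding so the counts sum to k."""
    total = sum(weights)
    quotas = [k * w / total for w in weights]
    counts = [int(q) for q in quotas]
    leftover = k - sum(counts)
    by_remainder = sorted(range(len(weights)),
                          key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts

def retrieve(user_mixture, item_embs, k):
    """user_mixture: list of (weight, query_embedding) pairs, one per interest.
    Each component retrieves its share of items; duplicates are skipped."""
    counts = allocate([w for w, _ in user_mixture], k)
    seen, results = set(), []
    for (_, query), n in zip(user_mixture, counts):
        # Brute-force ranking stands in for an ANN index lookup.
        ranked = sorted(range(len(item_embs)), key=lambda i: -dot(query, item_embs[i]))
        picked = [i for i in ranked if i not in seen][:n]
        seen.update(picked)
        results.extend(picked)
    return results
```

Compared with a single query embedding, which would fill the whole budget from one neighbourhood, the per-component budget forces some candidates from every interest with non-negligible weight.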

artificial intelligence, machine learning, social media, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-33380-4_29

2205.06205

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.50)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Cover Tree Compressed Sensing for Fast MR Fingerprint Recovery

Golbabaee, Mohammad, Chen, Zhouye, Wiaux, Yves, Davies, Mike E.

arXiv.org Machine Learning | Jun-23-2017

We adopt a data structure in the form of cover trees and iteratively apply approximate nearest neighbour (ANN) searches for fast compressed sensing reconstruction of signals living on discrete smooth manifolds. Leveraging the recent stability results for the inexact Iterative Projected Gradient (IPG) algorithm and using the cover tree's ANN searches, we decrease the projection cost of the IPG algorithm so that it grows logarithmically with the data population for low-dimensional smooth manifolds. We apply our results to quantitative MRI compressed sensing and in particular within the Magnetic Resonance Fingerprinting (MRF) framework. For a similar (or sometimes better) reconstruction accuracy, we report a 2-3 orders of magnitude reduction in computation compared to the standard iterative method, which uses brute-force searches.
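The role of the nearest-neighbour search inside the iteration can be seen in a minimal sketch of projected gradient descent onto a discrete set of candidate signals. It assumes a linear measurement model y = A x, a fixed step size, and a brute-force nearest-neighbour projection where the paper would use a cover tree ANN query; it also omits the scaling step used in MRF. All names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Compute A^T r for a matrix stored as a list of rows."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def project(z, dictionary):
    """Projection onto the discrete set: nearest dictionary atom to z.
    Brute force here; this is the step an ANN search would accelerate."""
    return min(dictionary, key=lambda d: sq_dist(z, d))

def ipg(y, A, dictionary, step=1.0, n_iter=20):
    """Iterate x <- P(x - step * A^T (A x - y)), starting from zero."""
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(A, x), y)]
        grad = rmatvec(A, residual)
        z = [xi - step * g for xi, g in zip(x, grad)]
        x = list(project(z, dictionary))
    return x
```

Every iteration calls `project` once, so a brute-force scan costs time linear in the dictionary size per iteration; replacing it with a tree-based approximate search is where the reported savings would come from.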

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

1706.07834

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.47)
Information Technology > Artificial Intelligence > Machine Learning (0.47)
Information Technology > Information Management > Search (0.46)
