AITopics | Information Retrieval

Collaborating Authors

Information Retrieval

Our accustomed systems of retrieving particular bits of information no longer fill the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but still usually requires time-consuming serial searching of one database after the other, and then moving on to other methods of searching for internet sources. And what if the information being sought is a sound byte? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

News Overviews Instructional Materials AI-Alerts Classics

Bing is the anti-AI search engine you should be using

PCWorldApr-21-2026, 11:00:00 GMT

PCWorld argues that Bing serves as a superior alternative to AI-heavy search engines by prioritizing human-authored content over automated summaries. AI search engines like Google's AI Mode often hide original sources and provide misleading information, with traffic to publishers dropping significantly.

information retrieval, machine learning, natural language, (14 more...)

PCWorld

Country: North America > United States > California (0.04)

Industry:

Information Technology > Security & Privacy (0.74)
Leisure & Entertainment > Games > Computer Games (0.55)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Information Management > Search (0.96)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Add feedback

Stochastic Structured Prediction under Bandit Feedback

Artem Sokolov, Julia Kreutzer, Stefan Riezler, Christopher Lo

Neural Information Processing SystemsMar-23-2026, 13:06:07 GMT

Neural Information Processing Systems http://nips.cc/

information retrieval, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Europe > Germany (0.14)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)
(2 more...)

Add feedback

An algorithm for L1 nearest neighbor search via monotonic embedding

Xinan Wang, Sanjoy Dasgupta

Neural Information Processing SystemsMar-23-2026, 08:16:12 GMT

Fast algorithms for nearest neighbor (NN) search have in large part focused on 2 distance. Here we develop an approach for 1 distance that begins with an explicit and exactly distance-preserving embedding of the points into 22. We show how this can efficiently be combined with random-projection based methods for 2 NN search, such as locality-sensitive hashing (LSH) or random projection trees. We rigorously establish the correctness of the methodology and show by experimentation using LSH that it is competitive in practice with available alternatives.

information retrieval, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.28)
North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.42)

Add feedback

Navigable Graphs for High-Dimensional Nearest Neighbor Search: Constructions and Limits

Neural Information Processing SystemsMar-21-2026, 01:32:06 GMT

There has been significant recent interest in graph-based nearest neighbor search methods, many of which are centered on the construction of (approximately) navigable graphs over high-dimensional point sets. A graph is navigable if we can successfully move from any starting node to any target node using a greedy routing strategy where we always move to the neighbor that is closest to the destination according to the given distance function. The complete graph is obviously navigable for any point set, but the important question for applications is if sparser graphs can be constructed. While this question is fairly well understood in low-dimensions, we establish some of the first upper and lower bounds for high-dimensional point sets. First, we give a simple and efficient way to construct a navigable graph with average degree $O(\sqrt{n \log n })$ for any set of $n$ points, in any dimension, for any distance function. We compliment this result with a nearly matching lower bound: even under the Euclidean metric in $O(\log n)$ dimensions, a random point set has no navigable graph with average degree $O(n^{\alpha})$ for any $\alpha < 1/2$. Our lower bound relies on sharp anti-concentration bounds for binomial random variables, which we use to show that the {near-neighborhoods} of a set of random points do not overlap significantly, forcing any navigable graph to have many edges.

artificial intelligence, information retrieval, natural language, (10 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.62)

Add feedback

TSDS: Data Selection for Task-Specific Model Finetuning

Neural Information Processing SystemsMar-18-2026, 07:19:17 GMT

Finetuning foundation models for specific tasks is an emerging paradigm in modern machine learning. The efficacy of task-specific finetuning largely depends on the selection of appropriate training data. We present TSDS (Task-Specific Data Selection), a framework to select data for task-specific model finetuning, guided by a small but representative set of examples from the target task. To do so, we formulate data selection for task-specific finetuning as an optimization problem with a distribution alignment loss based on optimal transport to capture the discrepancy between the selected data and the target distribution. In addition, we add a regularizer to encourage the diversity of the selected data and incorporate kernel density estimation into the regularizer to reduce the negative effects of near-duplicates among the candidate data.We connect our optimization problem to nearest neighbor search and design efficient algorithms to compute the optimal solution based on approximate nearest neighbor search techniques.We evaluate our method on data selection for both continued pretraining and instruction tuning of language models.We show that instruction tuning using data selected by our method with a 1\% selection ratio often outperforms using the full dataset and beats the baseline selection methods by 1.5 points in F1 score on average.

artificial intelligence, information retrieval, natural language, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.83)

Add feedback

An ensemble diversity approach to supervised binary hashing

Neural Information Processing SystemsMar-17-2026, 08:55:21 GMT

Binary hashing is a well-known approach for fast approximate nearest-neighbor search in information retrieval. Much work has focused on affinity-based objective functions involving the hash functions or binary codes. These objective functions encode neighborhood information between data points and are often inspired by manifold learning algorithms. They ensure that the hash functions differ from each other through constraints or penalty terms that encourage codes to be orthogonal or dissimilar across bits, but this couples the binary variables and complicates the already difficult optimization. We propose a much simpler approach: we train each hash function (or bit) independently from each other, but introduce diversity among them using techniques from classifier ensembles. Surprisingly, we find that not only is this faster and trivially parallelizable, but it also improves over the more complex, coupled objective function, and achieves state-of-the-art precision and recall in experiments with image retrieval.

information retrieval, machine learning, natural language, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.61)

Add feedback

Query Complexity of Bayesian Private Learning

Neural Information Processing SystemsMar-16-2026, 22:29:09 GMT

We study the query complexity of Bayesian Private Learning: a learner wishes to locate a random target within an interval by submitting queries, in the presence of an adversary who observes all of her queries but not the responses. How many queries are necessary and sufficient in order for the learner to accurately estimate the target, while simultaneously concealing the target from the adversary? Our main result is a query complexity lower bound that is tight up to the first order. We show that if the learner wants to estimate the target within an error of $\epsilon$, while ensuring that no adversary estimator can achieve a constant additive error with probability greater than $1/L$, then the query complexity is on the order of $L\log(1/\epsilon)$ as $\epsilon \to 0$. Our result demonstrates that increased privacy, as captured by $L$, comes at the expense of a \emph{multiplicative} increase in query complexity. The proof builds on Fano's inequality and properties of certain proportional-sampling estimators.

artificial intelligence, natural language, neural information processing system 31, (7 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.61)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)

Add feedback

Meta-Learning MCMC Proposals

Neural Information Processing SystemsMar-16-2026, 20:56:11 GMT

Effective implementations of sampling-based probabilistic inference often require manually constructed, model-specific proposals. Inspired by recent progresses in meta-learning for training learning agents that can generalize to unseen environments, we propose a meta-learning approach to building effective and generalizable MCMC proposals. We parametrize the proposal as a neural network to provide fast approximations to block Gibbs conditionals. The learned neural proposals generalize to occurrences of common structural motifs across different models, allowing for the construction of a library of learned inference primitives that can accelerate inference on unseen models with no model-specific training required. We explore several applications including open-universe Gaussian mixture models, in which our learned proposals outperform a hand-tuned sampler, and a real-world named entity recognition task, in which our sampler yields higher final F1 scores than classical single-site Gibbs sampling.

information retrieval, machine learning, natural language, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.61)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.61)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.61)

Add feedback

Self-Retrieval: End-to-End InformationRetrieval withOneLargeLanguageModel

Neural Information Processing SystemsMar-13-2026, 23:41:51 GMT

The rise of large language models (LLMs) has significantly transformed both the construction and application of information retrieval (IR) systems. However, current interactions between IR systems and LLMs remain limited, with LLMs merely serving as part of components within IR systems, and IR systems being constructed independently of LLMs. This separated architecture restricts knowledge sharing and deep collaboration between them. In this paper, we introduce Self-Retrieval, a novel end-to-end LLM-driven information retrieval architecture.

information retrieval, large language model, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > Singapore (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

The Search Engine for OnlyFans Models Who Look Like Your Crush

WIREDFeb-20-2026, 11:00:00 GMT

Presearch's "Doppelgänger" is trying to help people discover adult creators rather than use nonconsensual deepfakes. For three days in February, porn star Alix Lynx flew to Miami for her first exclusive creator gathering where she was in full grind mode: shooting Reels and talking strategy with other creators. "It was kind of like SoHo House for OnlyFans girls," she says of the experience, which is called The Circle and drew more than a dozen sex workers, including Remy LaCroix and Forrest Smith. Lynx, who is a former webcam model turned OnlyFans starlet, has a combined 2 million followers across Instagram, TikTok, and X . She joined OnlyFans in 2017 with "the luxury of having my own following," she says, but those numbers haven't always translated to subscriptions. It's why she was in Miami.

artificial intelligence, information retrieval, natural language, (17 more...)

WIRED

Country:

Asia (0.47)
North America > United States > California (0.14)

Industry:

Government (0.95)
Information Technology > Security & Privacy (0.69)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.42)

Add feedback