Goto

Collaborating Authors

 Information Retrieval


Reviews: Efficient nonmyopic batch active search

Neural Information Processing Systems

This work investigates the different batch-mode extensions of an active search method called efficient nonmyopic policy (ENS) [12]. ENS achieves good performance efficiently because it assumes the sample selections are independent after a step [12]. This paper proposes two strategies: 1) converting the batch active search problem to sequential one by guessing the hidden labels of selected samples 2) try to enumerate all possible hidden labels of selected samples by Monte Carlo. Strength: Allowing to sample batches is important in practice. The work addresses several theoretical and practical challenges in many aspects of batch active search such as how difficult batch active search could be, why pessimistic oracle works well, and how to make the methods more efficient by pruning.


Reviews: A Greedy Approach for Budgeted Maximum Inner Product Search

Neural Information Processing Systems

The aim of the paper is to propose a new greedy approach for Maximum Inner Product Search problem: given a candidate vector, retrieve a set of vectors with maximum inner product to the query vector. This is a crucial step in several machine learning and data mining algorithms, and the state of the art methods work in sub-linear time recently. The originality of the paper is to study the MIPS problem under a computational budget. The proposed approach achieves better balance between search efficiency and quality of the retrieved vectors, and does not require a nearest neighbor search phase, as commonly done by state of the art approaches. The authors claim impressive runtime results (their algorithm is 200x faster than the naive approach), and a top-5 precision greater than 75%.


Reviews: Query Complexity of Bayesian Private Learning

Neural Information Processing Systems

The authors obtain a tight fundamental limit of the query complexity using Fano's inequality, which is a standard tool for the lower bound analysis. The fundamental limit matches the query complexity of an upper bound algorithm (Replicated Bisection strategy) in the asymptotic order which is originally proposed in [11, 15]. The query complexity analysis shows a privacy-efficiency trade-off. The proof seems sound, and the paper is easy to follow.


Reviews: Query Complexity of Clustering with Side Information

Neural Information Processing Systems

NOTE: I am reviewing two papers that appear to be by the same authors and on the same general topic. The other one is on noisy queries without any side information. This paper gives upper and lower bounds on the query complexity of clustering based on noiseless pairwise comparisons, along with random side-information associated with every pair. While I am familiar with some of the related literature, my knowledge of it is far from complete, so it's a little hard to fully judge the value of the contributions. The paper is also so dense that I couldn't spend as much time as ideal checking the correctness -- especially in the long upper bound proof.


Reviews: Few-Shot Learning Through an Information Retrieval Lens

Neural Information Processing Systems

This work is a great extension of few shot learning paradigms in deep metric learning. Rather than considering pairs of samples for metric learning (siamese networks), or 3 examples (anchor, positive, and negative -- triplet learning), the authors propose to use the full minibatchs' relationships or ranking with respect to the anchor sample. This makes sense from the structured prediction, and efficient learning. Incorporating mean average precision into the loss directly, allows for optimization of the actual task at hand. Some comments: 1. t would be great to see more experimental results, on other datasets (face recognition). Not clear how to set \lambda, as in a number of cases, a wrong value for \lambda leads to weak results.


Named Clinical Entity Recognition Benchmark

arXiv.org Artificial Intelligence

This technical report introduces a Named Clinical Entity Recognition Benchmark for evaluating language models in healthcare, addressing the crucial natural language processing (NLP) task of extracting structured information from clinical narratives to support applications like automated coding, clinical trial cohort identification, and clinical decision support. The leaderboard provides a standardized platform for assessing diverse language models, including encoder and decoder architectures, on their ability to identify and classify clinical entities across multiple medical domains. A curated collection of openly available clinical datasets is utilized, encompassing entities such as diseases, symptoms, medications, procedures, and laboratory measurements. Importantly, these entities are standardized according to the Observational Medical Outcomes Partnership (OMOP) Common Data Model, ensuring consistency and interoperability across different healthcare systems and datasets, and a comprehensive evaluation of model performance. Performance of models is primarily assessed using the F1-score, and it is complemented by various assessment modes to provide comprehensive insights into model performance. The report also includes a brief analysis of models evaluated to date, highlighting observed trends and limitations. By establishing this benchmarking framework, the leaderboard aims to promote transparency, facilitate comparative analyses, and drive innovation in clinical entity recognition tasks, addressing the need for robust evaluation methods in healthcare NLP.


Tensor-Train Point Cloud Compression and Efficient Approximate Nearest-Neighbor Search

arXiv.org Artificial Intelligence

Nearest-neighbor search in large vector databases is crucial for various machine learning applications. This paper introduces a novel method using tensor-train (TT) low-rank tensor decomposition to efficiently represent point clouds and enable fast approximate nearest-neighbor searches. We propose a probabilistic interpretation and utilize density estimation losses like Sliced Wasserstein to train TT decompositions, resulting in robust point cloud compression. We reveal an inherent hierarchical structure within TT point clouds, facilitating efficient approximate nearest-neighbor searches. In our paper, we provide detailed insights into the methodology and conduct comprehensive comparisons with existing methods. We demonstrate its effectiveness in various scenarios, including out-of-distribution (OOD) detection problems and approximate nearest-neighbor (ANN) search tasks.


A General Framework for Robust Interactive Learning

Neural Information Processing Systems

We propose a general framework for interactively learning models, such as (binary or non-binary) classifiers, orderings/rankings of items, or clusterings of data points. Our framework is based on a generalization of Angluin's equivalence query model and Littlestone's online learning model: in each iteration, the algorithm proposes a model, and the user either accepts it or reveals a specific mistake in the proposal.


Streamlining Conformal Information Retrieval via Score Refinement

arXiv.org Artificial Intelligence

Information retrieval (IR) methods, like retrieval augmented generation, are fundamental to modern applications but often lack statistical guarantees. Conformal prediction addresses this by retrieving sets guaranteed to include relevant information, yet existing approaches produce large-sized sets, incurring high computational costs and slow response times. In this work, we introduce a score refinement method that applies a simple monotone transformation to retrieval scores, leading to significantly smaller conformal sets while maintaining their statistical guarantees. Experiments on various BEIR benchmarks validate the effectiveness of our approach in producing compact sets containing relevant information.


Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms

arXiv.org Artificial Intelligence

Zeroth-order (ZO) optimization is one key technique for machine learning problems where gradient calculation is expensive or impossible. Several variance reduced ZO proximal algorithms have been proposed to speed up ZO optimization for non-smooth problems, and all of them opted for the coordinated ZO estimator against the random ZO estimator when approximating the true gradient, since the former is more accurate. While the random ZO estimator introduces bigger error and makes convergence analysis more challenging compared to coordinated ZO estimator, it requires only $\mathcal{O}(1)$ computation, which is significantly less than $\mathcal{O}(d)$ computation of the coordinated ZO estimator, with $d$ being dimension of the problem space. To take advantage of the computationally efficient nature of the random ZO estimator, we first propose a ZO objective decrease (ZOOD) property which can incorporate two different types of errors in the upper bound of convergence rate. Next, we propose two generic reduction frameworks for ZO optimization which can automatically derive the convergence results for convex and non-convex problems respectively, as long as the convergence rate for the inner solver satisfies the ZOOD property. With the application of two reduction frameworks on our proposed ZOR-ProxSVRG and ZOR-ProxSAGA, two variance reduced ZO proximal algorithms with fully random ZO estimators, we improve the state-of-the-art function query complexities from $\mathcal{O}\left(\min\{\frac{dn^{1/2}}{\epsilon^2}, \frac{d}{\epsilon^3}\}\right)$ to $\tilde{\mathcal{O}}\left(\frac{n+d}{\epsilon^2}\right)$ under $d > n^{\frac{1}{2}}$ for non-convex problems, and from $\mathcal{O}\left(\frac{d}{\epsilon^2}\right)$ to $\tilde{\mathcal{O}}\left(n\log\frac{1}{\epsilon}+\frac{d}{\epsilon}\right)$ for convex problems.