AITopics | colbert

e8b1cbd05f6e6a358a81dee52493dd06-Supplemental.pdf

Neural Information Processing Systems, Feb-11-2026, 16:57:50 GMT

condenser, hop, sampling depth, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Rethinking the Role of Token Retrieval in Multi-Vector Retrieval

Neural Information Processing Systems, Dec-24-2025, 11:27:40 GMT

Multi-vector retrieval models such as ColBERT [Khattab et al., 2020] allow token-level interactions between queries and documents, and hence achieve state of the art on many information retrieval benchmarks. However, their non-linear scoring function cannot be scaled to millions of documents, necessitating a three-stage process for inference: retrieving initial candidates via token retrieval, accessing all token vectors, and scoring the initial candidate documents. The non-linear scoring function is applied over all token vectors of each candidate document, making the inference process complicated and slow. In this paper, we aim to simplify the multi-vector retrieval by rethinking the role of token retrieval. We present XTR, ConteXtualized Token Retriever, which introduces a simple, yet novel, objective function that encourages the model to retrieve the most important document tokens first. The improvement to token retrieval allows XTR to rank candidates only using the retrieved tokens rather than all tokens in the document, and enables a newly designed scoring stage that is two-to-three orders of magnitude cheaper than that of ColBERT.

name change, retrieval, token retrieval, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.39)

ModernBERT + ColBERT: Enhancing biomedical RAG through an advanced re-ranking retriever

Rivera, Eduardo Martínez, Menolascina, Filippo

arXiv.org Artificial Intelligence, Oct-7-2025

Retrieval-Augmented Generation (RAG) is a powerful technique for enriching Large Language Models (LLMs) with external knowledge, allowing for factually grounded responses, a critical requirement in high-stakes domains such as healthcare. However, the efficacy of RAG systems is fundamentally restricted by the performance of their retrieval module, since irrelevant or semantically misaligned documents directly compromise the accuracy of the final generated response. General-purpose dense retrievers can struggle with the nuanced language of specialised domains, while the high accuracy of in-domain models is often achieved at prohibitive computational costs. In this work, we aim to address this trade-off by developing and evaluating a two-stage retrieval architecture that combines a lightweight ModernBERT bidirectional encoder for efficient initial candidate retrieval with a ColBERTv2 late-interaction model for fine-grained re-ranking. We conduct comprehensive evaluations of our retriever module performance and RAG system performance in the biomedical context, fine-tuning the IR module using 10k question-passage pairs from PubMedQA. Our analysis of the retriever module confirmed the positive impact of the ColBERT re-ranker, which improved Recall@3 by up to 4.2 percentage points compared to its retrieve-only counterpart. When integrated into the biomedical RAG, our IR module leads to a state-of-the-art average accuracy of 0.4448 on the five tasks of the MIRAGE question-answering benchmark, outperforming strong baselines such as MedCPT (0.4436). Our ablation studies reveal that this performance is critically dependent on a joint fine-tuning process that aligns the retriever and re-ranker; otherwise, the re-ranker might degrade the performance.
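The retrieve-then-re-rank pattern and the Recall@3 metric mentioned in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the document IDs, the toy scores, and the helper names `rerank` and `recall_at_k` are hypothetical, and the Recall@k definition used (fraction of relevant documents found in the top k) is the common one, not necessarily the exact variant the authors report.

```python
def rerank(candidates, score_fn):
    """Reorder first-stage candidates by a second-stage score, highest first."""
    return sorted(candidates, key=score_fn, reverse=True)


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant documents that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)


# Hypothetical first-stage output (e.g. from a lightweight encoder).
first_stage = ["d4", "d2", "d7", "d1", "d9"]

# Hypothetical re-ranker scores (stand-ins for late-interaction scores).
toy_scores = {"d4": 0.2, "d2": 0.1, "d7": 0.3, "d1": 0.9, "d9": 0.05}

reranked = rerank(first_stage, toy_scores.get)
relevant = {"d1"}

before = recall_at_k(first_stage, relevant, k=3)
after = recall_at_k(reranked, relevant, k=3)
```

In this toy case the relevant document sits at rank 4 after the first stage and moves into the top 3 after re-ranking, which is the kind of change a Recall@3 improvement measures.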

accuracy, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2510.04757

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Appendices for Baleen A Data Details

Neural Information Processing Systems, Aug-18-2025, 09:47:40 GMT

Table 6: Sizes of the splits of the datasets used in this work. It contains approximately 5M passages (1.5 GiB uncompressed). We implement Baleen using Python 3.7 and PyTorch 1.6 and rely extensively on the HuggingFace We train and test with automatic mixed precision that is built into PyTorch. To train the single-hop retriever used to initiate the supervision procedure of 3.2, we follow the training strategy of Khattab et al. ColBERT model to create training triples, and then we train our retriever (in this case, FLIPR for first-hop) with these triples.

artificial intelligence, hop, machine learning, (20 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Provable Post-Training Quantization: Theoretical Analysis of OPTQ and Qronos

Zhang, Haoyu, Zhang, Shihao, Colbert, Ian, Saab, Rayan

arXiv.org Artificial Intelligence, Aug-8-2025

Post-training quantization (PTQ) has become a crucial tool for reducing the memory and compute costs of modern deep neural networks, including large language models (LLMs). Among PTQ algorithms, the OPTQ framework, also known as GPTQ, has emerged as a leading method due to its computational efficiency and strong empirical performance. Despite its widespread adoption, however, OPTQ lacks rigorous quantitative theoretical guarantees. This paper presents the first quantitative error bounds for both deterministic and stochastic variants of OPTQ, as well as for Qronos, a recent related state-of-the-art PTQ algorithm. We analyze how OPTQ's iterative procedure induces quantization error and derive non-asymptotic 2-norm error bounds that depend explicitly on the calibration data and a regularization parameter that OPTQ uses. Our analysis provides theoretical justification for several practical design choices, including the widely used heuristic of ordering features by decreasing norm, as well as guidance for selecting the regularization parameter. For the stochastic variant, we establish stronger infinity-norm error bounds, which enable control over the required quantization alphabet and are particularly useful for downstream layers and nonlinearities. Finally, we extend our analysis to Qronos, providing new theoretical bounds, for both its deterministic and stochastic variants, that help explain its empirical advantages.

large language model, machine learning, quantization, (17 more...)

arXiv.org Artificial Intelligence

2508.04853

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Extracting Document Relations from Search Corpus by Marginalizing over User Queries

Iwamoto, Yuki, Tsunoda, Kaoru, Kaneiwa, Ken

arXiv.org Artificial Intelligence, Jul-16-2025

Understanding relationships between documents in large-scale corpora is essential for knowledge discovery and information organization. However, existing approaches rely heavily on manual annotation or predefined relationship taxonomies. We propose EDR-MQ (Extracting Document Relations by Marginalizing over User Queries), a novel framework that discovers document relationships through query marginalization. EDR-MQ is based on the insight that strongly related documents often co-occur in results across diverse user queries, enabling us to estimate joint probabilities between document pairs by marginalizing over a collection of queries. To enable this query marginalization approach, we develop Multiply Conditioned Retrieval-Augmented Generation (MC-RAG), which employs conditional retrieval where subsequent document retrievals depend on previously retrieved content. By observing co-occurrence patterns across diverse queries, EDR-MQ estimates joint probabilities between document pairs without requiring labeled training data or predefined taxonomies. Experimental results show that our query marginalization approach successfully identifies meaningful document relationships, revealing topical clusters, evidence chains, and cross-domain connections that are not apparent through traditional similarity-based methods. Our query-driven framework offers a practical approach to document organization that adapts to different user perspectives and information needs.
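The idea of estimating joint probabilities for document pairs from co-occurrence across queries can be sketched in a few lines. This is a deliberately simplified illustration and not the authors' MC-RAG procedure: it assumes a uniform distribution over queries and treats a pair as co-occurring whenever both documents appear in the same result list. The function name `cooccurrence_probs` and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_probs(results_per_query):
    """Estimate P(d_i, d_j) as the share of queries whose results contain both.

    Assumes every query is equally likely (uniform marginalization).
    """
    counts = Counter()
    for docs in results_per_query:
        # Sort so each unordered pair has one canonical key.
        for pair in combinations(sorted(set(docs)), 2):
            counts[pair] += 1
    n_queries = len(results_per_query)
    return {pair: c / n_queries for pair, c in counts.items()}


# Hypothetical retrieval results for three user queries.
results = [
    ["a", "b"],
    ["a", "b", "c"],
    ["c", "d"],
]
probs = cooccurrence_probs(results)
```

Pairs that appear together under many different queries receive a higher estimate, which matches the intuition stated in the abstract; a faithful implementation would replace the uniform weighting and the binary co-occurrence indicator with the model's own retrieval probabilities.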

information retrieval, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.10726

Country:

Asia > Japan (0.28)
North America > United States (0.28)

Genre: Research Report (0.70)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

A model and package for German ColBERT

Dang, Thuong, Chen, Qiqi

arXiv.org Artificial Intelligence, Apr-30-2025

The original ColBERT model was proposed by Khattab and Zaharia [8], introducing the MaxSim scoring function based on token-level interactions. The model was trained using a softmax cross-entropy loss over triplets derived from the MS MARCO Ranking [1] and TREC Complex Answer Retrieval (TREC CAR) [5] datasets, leveraging the English BERT model [4] as its backbone encoder. The ColBERT MaxSim score can be interpreted as a substitute for the BM25 score used in full-text search; consequently, there are similarities between the ColBERT retrieval method and BM25-based full-text search. This will be discussed in detail in Section 2. ColBERT is flexible and can be used as a first retrieval method or a reranker. The ColBERT score is computed at the token similarity level and can be applied in contexts where keyword similarities are significant. A ColBERT model was also trained for Japanese [3], where the author also discussed different strategies to choose hard negatives using the multilingual e5 embedding model and BM25.
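The MaxSim scoring function named above can be illustrated with a minimal sketch: for each query token vector, take the maximum similarity against all document token vectors, then sum over query tokens. The sketch assumes the token vectors are already unit-normalized so that the dot product stands in for cosine similarity; the function name `maxsim_score` and the toy 2-dimensional vectors are hypothetical and not taken from any ColBERT package.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: sum over query tokens of the best document match.

    Assumes doc_vecs is non-empty and all vectors share one dimensionality.
    """
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy unit vectors standing in for contextual token embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # matches both query tokens
doc_b = [[1.0, 0.0]]              # matches only the first query token

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
```

Because each query token contributes only its single best match, the score rewards documents that cover every query token somewhere, which is one way to see the analogy with term-matching scores such as BM25 drawn in the abstract.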

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2504.20083

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.88)

Towards Lossless Token Pruning in Late-Interaction Retrieval Models

Zong, Yuxuan, Piwowarski, Benjamin

arXiv.org Artificial Intelligence, Apr-18-2025

Late interaction neural IR models like ColBERT offer a competitive effectiveness-efficiency trade-off across many benchmarks. However, they require a huge memory space to store the contextual representation for all the document tokens. Some works have proposed using either heuristics or statistical-based techniques to prune tokens from each document. This, however, does not guarantee that the removed tokens have no impact on the retrieval score. Our work uses a principled approach to define how to prune tokens without impacting the score between a document and a query. We introduce three regularization losses that induce a solution with high pruning ratios, as well as two pruning strategies. We study them experimentally (in and out-domain), showing that we can preserve ColBERT's performance while using only 30% of the tokens.

information retrieval, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3726302.3730100

2504.12778

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)

You've Seen This Bizarre Video Phenomenon. There's a Reason It's Suddenly Everywhere.

Slate, Apr-8-2025, 14:00:00 GMT

Imagine yourself strapped to a chair with your head held in place by some device. The only thing you're free to move is your eyes. You hear something to your left; you'd want to turn your head left to look, or at least take a sidelong glance. Your brain sends the necessary impulses to your muscles--only you can't move.

aspect ratio, social media, video, (15 more...)

Slate

Country: North America > United States > New York (0.05)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.70)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.49)

Retrieval Augmented Spelling Correction for E-Commerce Applications

Guo, Xuan, Patki, Rohit, Everaert, Dante, Potts, Christopher

arXiv.org Artificial Intelligence, Oct-15-2024

The rapid introduction of new brand names into everyday language poses a unique challenge for e-commerce spelling correction services, which must distinguish genuine misspellings from novel brand names that use unconventional spelling. We seek to address this challenge via Retrieval Augmented Generation (RAG). In this approach, product names are retrieved from a catalog and incorporated into the context used by a large language model (LLM) that has been fine-tuned to do contextual spelling correction. Through quantitative evaluation and qualitative error analyses, we find improvements in spelling correction utilizing the RAG framework beyond a stand-alone LLM. We also demonstrate the value of additional fine-tuning of the LLM to incorporate retrieved context.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.11655

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > United States > Maryland > Baltimore (0.04)
(6 more...)

Genre: Research Report (0.40)

Industry: Information Technology > Services > e-Commerce Services (0.62)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
