Adlakha, Vaibhav
MMTEB: Massive Multilingual Text Embedding Benchmark
Enevoldsen, Kenneth, Chung, Isaac, Kerboua, Imene, Kardos, Márton, Mathur, Ashwin, Stap, David, Gala, Jay, Siblini, Wissam, Krzemiński, Dominik, Winata, Genta Indra, Sturua, Saba, Utpala, Saiteja, Ciancone, Mathieu, Schaeffer, Marion, Sequeira, Gabriel, Misra, Diganta, Dhakal, Shreeya, Rystrøm, Jonathan, Solomatin, Roman, Çağatan, Ömer, Kundu, Akash, Bernstorff, Martin, Xiao, Shitao, Sukhlecha, Akshita, Pahwa, Bhavish, Poświata, Rafał, GV, Kranthi Kiran, Ashraf, Shawon, Auras, Daniel, Plüster, Björn, Harries, Jan Philipp, Magne, Loïc, Mohr, Isabelle, Hendriksen, Mariya, Zhu, Dawei, Gisserot-Boukhlef, Hippolyte, Aarsen, Tom, Kostkan, Jan, Wojtasik, Konrad, Lee, Taemin, Šuppa, Marek, Zhang, Crystina, Rocca, Roberta, Hamdy, Mohammed, Michail, Andrianos, Yang, John, Faysse, Manuel, Vatolin, Aleksei, Thakur, Nandan, Dey, Manan, Vasani, Dipam, Chitale, Pranjal, Tedeschi, Simone, Tai, Nguyen, Snegirev, Artem, Günther, Michael, Xia, Mengzhou, Shi, Weijia, Lù, Xing Han, Clive, Jordan, Krishnakumar, Gayatri, Maksimova, Anna, Wehrli, Silvan, Tikhonova, Maria, Panchal, Henil, Abramov, Aleksandr, Ostendorff, Malte, Liu, Zheng, Clematide, Simon, Miranda, Lester James, Fenogenova, Alena, Song, Guangyu, Safi, Ruqiya Bin, Li, Wen-Ding, Borghini, Alessia, Cassano, Federico, Su, Hongjin, Lin, Jimmy, Yen, Howard, Hansen, Lasse, Hooker, Sara, Xiao, Chenghao, Adlakha, Vaibhav, Weller, Orion, Reddy, Siva, Muennighoff, Niklas
Text embeddings are typically evaluated on a limited set of tasks, constrained in language, domain, and task diversity. To address these limitations and provide a more comprehensive evaluation, we introduce the Massive Multilingual Text Embedding Benchmark (MMTEB), a large-scale, community-driven expansion of MTEB covering over 500 quality-controlled evaluation tasks across 250+ languages. MMTEB includes a diverse set of challenging, novel tasks such as instruction following, long-document retrieval, and code retrieval, representing the largest multilingual collection of evaluation tasks for embedding models to date. Using this collection, we develop several highly multilingual benchmarks, which we use to evaluate a representative set of models. We find that while large language models (LLMs) with billions of parameters can achieve state-of-the-art performance on certain language subsets and task categories, the best-performing publicly available model is multilingual-e5-large-instruct with only 560 million parameters. To facilitate accessibility and reduce computational cost, we introduce a novel downsampling method based on inter-task correlation, ensuring a diverse selection while preserving relative model rankings. Furthermore, we optimize tasks such as retrieval by sampling hard negatives, creating smaller but effective splits. These optimizations allow us to introduce benchmarks that drastically reduce computational demands. For instance, our newly introduced zero-shot English benchmark maintains a ranking order similar to the full-scale version at a fraction of the computational cost.
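The abstract mentions a downsampling method based on inter-task correlation but does not spell it out. As a minimal sketch of the general idea only (not the paper's actual algorithm), one could greedily select tasks whose model rankings are least correlated with the tasks already chosen, so that the reduced benchmark still carries diverse ranking information. The score matrix, the `select_diverse_tasks` helper, and the greedy criterion below are all illustrative assumptions.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical score matrix: rows are models, columns are tasks.
rng = np.random.default_rng(0)
scores = rng.random((10, 40))  # placeholder data, 10 models x 40 tasks

def select_diverse_tasks(scores: np.ndarray, k: int) -> list:
    """Greedily pick k tasks whose model rankings are least correlated
    with the tasks already selected (one way to shrink a benchmark while
    keeping ranking information diverse)."""
    corr, _ = spearmanr(scores)           # task-by-task rank correlations
    corr = np.abs(corr)
    selected = [int(np.argmax(corr.mean(axis=0)))]  # start from the most "central" task
    while len(selected) < k:
        remaining = [t for t in range(scores.shape[1]) if t not in selected]
        # A task's redundancy = its strongest correlation to the selected set.
        redundancy = [corr[t, selected].max() for t in remaining]
        selected.append(remaining[int(np.argmin(redundancy))])
    return selected

print(select_diverse_tasks(scores, k=8))
```

In practice the score matrix would come from prior benchmark runs, and the stopping criterion would be tuned against how well the reduced set preserves the full leaderboard's ordering.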
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
BehnamGhader, Parishad, Adlakha, Vaibhav, Mosbach, Marius, Bahdanau, Dzmitry, Chapados, Nicolas, Reddy, Siva
Large decoder-only language models (LLMs) are the state-of-the-art models on most of today's NLP tasks and benchmarks. Yet, the community is only slowly adopting these models for text embedding tasks, which require rich contextualized representations. In this work, we introduce LLM2Vec, a simple unsupervised approach that can transform any decoder-only LLM into a strong text encoder. LLM2Vec consists of three simple steps: 1) enabling bidirectional attention, 2) masked next token prediction, and 3) unsupervised contrastive learning. We demonstrate the effectiveness of LLM2Vec by applying it to 3 popular LLMs ranging from 1.3B to 7B parameters and evaluating the transformed models on English word- and sequence-level tasks. We outperform encoder-only models by a large margin on word-level tasks and reach a new unsupervised state-of-the-art performance on the Massive Text Embedding Benchmark (MTEB). Moreover, when combining LLM2Vec with supervised contrastive learning, we achieve state-of-the-art performance on MTEB among models that train only on publicly available data. Our strong empirical results and extensive analysis demonstrate that LLMs can be effectively transformed into universal text encoders in a parameter-efficient manner without the need for expensive adaptation or synthetic GPT-4-generated data.
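For intuition only, the sketch below mimics two of the three ingredients at toy scale in plain PyTorch rather than inside an actual LLM: step 1 amounts to dropping the causal attention mask, and step 3 is approximated with a SimCSE-style dropout-based contrastive loss (step 2, masked next token prediction, is omitted). The tensors, dimensions, pooling, and temperature are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

# Toy stand-in for one decoder self-attention layer over token hidden states.
hidden = torch.randn(2, 6, 32)                      # (batch, seq_len, dim)
attn = torch.nn.MultiheadAttention(32, num_heads=4, batch_first=True)

# Step 1: enabling bidirectional attention = removing the causal mask.
causal_mask = torch.triu(torch.ones(6, 6, dtype=torch.bool), diagonal=1)
causal_out, _ = attn(hidden, hidden, hidden, attn_mask=causal_mask)  # decoder-style
bidir_out, _ = attn(hidden, hidden, hidden)                          # encoder-style

# Step 3 (sketch): SimCSE-style unsupervised contrastive loss, where two
# dropout-perturbed views of the same sequence form a positive pair.
def embed(x: torch.Tensor) -> torch.Tensor:
    x = F.dropout(x, p=0.1, training=True)  # dropout noise creates a second "view"
    out, _ = attn(x, x, x)                  # bidirectional (no causal mask)
    return out.mean(dim=1)                  # mean pooling over tokens

z1, z2 = embed(hidden), embed(hidden)       # two stochastic views of the same batch
sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / 0.05
loss = F.cross_entropy(sim, torch.arange(sim.size(0)))  # in-batch negatives
print(loss.item())
```

In the actual method, these operations are applied to a full pretrained decoder-only LLM and the model is additionally adapted with masked next token prediction before contrastive training.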
Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering
Adlakha, Vaibhav, BehnamGhader, Parishad, Lu, Xing Han, Meade, Nicholas, Reddy, Siva
Retriever-augmented instruction-following models are attractive alternatives to fine-tuned approaches for information-seeking tasks such as question answering (QA). By simply prepending retrieved documents to their input along with an instruction, these models can be adapted to various information domains and tasks without additional fine-tuning. While the model responses tend to be natural and fluent, the additional verbosity makes traditional QA evaluation metrics such as exact match (EM) and F1 unreliable for accurately quantifying model performance. In this work, we investigate the performance of instruction-following models across three information-seeking QA tasks. We use both automatic and human evaluation to assess these models along two dimensions: 1) how well they satisfy the user's information need (correctness), and 2) whether they produce a response based on the provided knowledge (faithfulness). Guided by human evaluation and analysis, we highlight the shortcomings of traditional metrics for both correctness and faithfulness. We then propose simple token-overlap based and model-based metrics that reflect the true performance of these models. Our analysis reveals that instruction-following models are competitive, and sometimes even outperform fine-tuned models for correctness. However, these models struggle to stick to the provided knowledge and often hallucinate in their responses. We hope our work encourages a more holistic evaluation of instruction-following models for QA. Our code and data are available at https://github.com/McGill-NLP/instruct-qa
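To make the "token-overlap based" idea concrete, here is a minimal sketch of the kind of metric the abstract alludes to: recall of gold-answer tokens in a verbose response (correctness) and the fraction of response tokens grounded in the retrieved passage (faithfulness). The function names, tokenization, and normalization are assumptions for illustration; the linked repository contains the authors' actual metrics.

```python
import re

def _tokens(text: str) -> list:
    # Naive lowercased word tokenization; a real implementation would also
    # strip articles and punctuation as in standard QA normalization.
    return re.findall(r"\w+", text.lower())

def answer_recall(response: str, gold_answer: str) -> float:
    """Fraction of gold-answer tokens appearing in the (possibly verbose)
    model response -- more forgiving than EM or F1 when the model wraps the
    answer in a fluent sentence."""
    gold = _tokens(gold_answer)
    resp = set(_tokens(response))
    return sum(tok in resp for tok in gold) / max(len(gold), 1)

def knowledge_precision(response: str, passage: str) -> float:
    """Fraction of response tokens found in the retrieved passage, a crude
    proxy for faithfulness to the provided knowledge."""
    resp = _tokens(response)
    passage_toks = set(_tokens(passage))
    return sum(tok in passage_toks for tok in resp) / max(len(resp), 1)

print(answer_recall("Sure! The capital of France is Paris.", "Paris"))  # 1.0
```

Exact match would score the verbose response above as 0, which is exactly the failure mode the abstract highlights.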
Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
Madsen, Andreas, Meade, Nicholas, Adlakha, Vaibhav, Reddy, Siva
To explain NLP models, many methods indicate which input tokens are important for a prediction. However, an open question is whether these methods accurately reflect the model's logic, a property often called faithfulness. In this work, we adapt and improve a recently proposed faithfulness benchmark from computer vision called ROAR (RemOve And Retrain), by Hooker et al. (2019). We improve ROAR by recursively removing dataset redundancies, which otherwise interfere with ROAR. We adapt and apply ROAR to popular NLP importance measures, namely attention, gradient, and integrated gradients. We also include mutual information as a baseline. Evaluation is performed on a suite of classification tasks commonly used in the attention-faithfulness literature. Finally, we propose a scalar faithfulness metric, which makes it easy to compare results across papers. We find that importance measures considered unfaithful for computer vision tasks perform favorably on NLP tasks, that the faithfulness of an importance measure is task-dependent, and that the computational overhead of integrated gradients is rarely justified.
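The abstract describes the recursive remove-and-retrain procedure only at a high level. The snippet below is a schematic of that loop, not the authors' implementation: `importance_fn`, `train_fn`, and `eval_fn` are hypothetical callbacks standing in for an importance measure (e.g. attention or integrated gradients), a training routine, and an evaluation routine, and the masking schedule is an assumption.

```python
import numpy as np

def recursive_roar(texts, labels, importance_fn, train_fn, eval_fn,
                   steps=5, frac_per_step=0.1, mask_token="[MASK]"):
    """Sketch of a recursive RemOve-And-Retrain loop: at each step, mask the
    currently most important tokens, retrain from scratch, and re-estimate
    importance on the masked data before masking more. A faithful importance
    measure should degrade accuracy faster than random masking."""
    accuracies = []
    for _ in range(steps):
        model = train_fn(texts, labels)              # retrain on current (masked) data
        accuracies.append(eval_fn(model))
        new_texts = []
        for text in texts:
            toks = text.split()
            scores = importance_fn(model, text)      # one importance score per token
            k = max(1, int(len(toks) * frac_per_step))
            top = set(np.argsort(scores)[::-1][:k])  # indices of most important tokens
            new_texts.append(" ".join(
                mask_token if i in top else t for i, t in enumerate(toks)))
        texts = new_texts
    return accuracies
```

Recomputing importance after each masking round is what distinguishes the recursive variant from plain ROAR, which scores tokens only once on the original data.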