AITopics | Dai, Zhuyun

Dai, Zhuyun

Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

Lee, Jinhyuk, Chen, Anthony, Dai, Zhuyun, Dua, Dheeru, Sachan, Devendra Singh, Boratko, Michael, Luan, Yi, Arnold, Sébastien M. R., Perot, Vincent, Dalmia, Siddharth, Hu, Hexiang, Lin, Xudong, Pasupat, Panupong, Amini, Aida, Cole, Jeremy R., Riedel, Sebastian, Naim, Iftekhar, Chang, Ming-Wei, Guu, Kelvin

arXiv.org Artificial Intelligence, Jun-18-2024

Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases. Leveraging LCLMs' ability to natively ingest and process entire corpora of information offers numerous advantages. It enhances user-friendliness by eliminating the need for specialized knowledge of tools, provides robust end-to-end modeling that minimizes cascading errors in complex pipelines, and allows for the application of sophisticated prompting techniques across the entire system. To assess this paradigm shift, we introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning. Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks. However, LCLMs still face challenges in areas like compositional reasoning that are required in SQL-like tasks. Notably, prompting strategies significantly influence performance, emphasizing the need for continued research as context lengths grow. Overall, LOFT provides a rigorous testing ground for LCLMs, showcasing their potential to supplant existing paradigms and tackle novel tasks as model capabilities scale.

large language model, machine learning, natural language, (19 more...)

2406.13121

Country:

Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Leisure & Entertainment > Sports > Hockey (1.00)
Media (0.93)
Automobiles & Trucks > Manufacturer (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)

Gecko: Versatile Text Embeddings Distilled from Large Language Models

Lee, Jinhyuk, Dai, Zhuyun, Ren, Xiaoqi, Chen, Blair, Cer, Daniel, Cole, Jeremy R., Hui, Kai, Boratko, Michael, Kapadia, Rajvi, Ding, Wen, Luan, Yi, Duddu, Sai Meher Karthik, Abrego, Gustavo Hernandez, Shi, Weiqiang, Gupta, Nithi, Kusupati, Aditya, Jain, Prateek, Jonnalagadda, Siddhartha Reddy, Chang, Ming-Wei, Naim, Iftekhar

arXiv.org Artificial Intelligence, Mar-29-2024

Text embedding models represent natural language as dense vectors, positioning semantically similar text near each other within the embedding space (Gao et al., 2021; Le and Mikolov, 2014; Reimers and Gurevych, 2019). These embeddings are commonly used for a wide range of downstream tasks including document retrieval, sentence similarity, classification, and clustering (Muennighoff et al., 2023). Instead of building separate embedding models for each downstream task, recent efforts seek to create a single embedding model supporting many tasks. The recent development of general-purpose text embedding models presents a challenge: these models require large amounts of training data to comprehensively cover desired domains and skills. Recent embedding efforts have focused on using extensive collections of training examples (Li et al., 2023; Wang et al., 2022).
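
To make "positioning semantically similar text near each other" concrete, here is a minimal sketch of nearest-neighbor lookup by cosine similarity. The three-dimensional vectors and the `nearest` helper are made up for illustration; they are not output from Gecko or any real embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" (hypothetical values; real models use hundreds of dimensions).
embeddings = {
    "how do I reset my password": [0.9, 0.1, 0.0],
    "steps to recover account access": [0.8, 0.3, 0.1],
    "best pizza toppings": [0.0, 0.2, 0.9],
}


def nearest(query_text):
    """Return the other text whose vector is closest to query_text's vector."""
    query_vec = embeddings[query_text]
    others = [t for t in embeddings if t != query_text]
    return max(others, key=lambda t: cosine_similarity(query_vec, embeddings[t]))
```

With these toy values, the two account-related sentences are each other's nearest neighbors, while the pizza sentence sits far from both. Retrieval, clustering, and similarity tasks all build on this kind of distance comparison.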

large language model, machine learning, natural language, (15 more...)

2403.20327

Country: Asia > Middle East > UAE (0.14)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Sports > Olympic Games (0.68)
Media > Film (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

RARR: Researching and Revising What Language Models Say, Using Language Models

Gao, Luyu, Dai, Zhuyun, Pasupat, Panupong, Chen, Anthony, Chaganty, Arun Tejasvi, Fan, Yicheng, Zhao, Vincent Y., Lao, Ni, Lee, Hongrae, Juan, Da-Cheng, Guu, Kelvin

arXiv.org Artificial Intelligence, May-31-2023

Language models (LMs) now excel at many tasks such as few-shot learning, question answering, reasoning, and dialog. However, they sometimes generate unsupported or misleading content. A user cannot easily determine whether their outputs are trustworthy or not, because most LMs do not have any built-in mechanism for attribution to external evidence. To enable attribution while still preserving all the powerful advantages of recent generation models, we propose RARR (Retrofit Attribution using Research and Revision), a system that 1) automatically finds attribution for the output of any text generation model and 2) post-edits the output to fix unsupported content while preserving the original output as much as possible. When applied to the output of several state-of-the-art LMs on a diverse set of generation tasks, we find that RARR significantly improves attribution while otherwise preserving the original input to a much greater degree than previously explored edit models. Furthermore, the implementation of RARR requires only a handful of training examples, a large language model, and standard web search.

attribution, machine learning, question answering, (20 more...)

2210.08726

Country:

North America > United States (1.00)
Asia (0.94)
Africa (0.69)
Europe > United Kingdom (0.68)

Genre: Research Report > New Finding (0.46)

Industry:

Media > Television (1.00)
Media > Film (1.00)
Health & Medicine > Therapeutic Area (0.94)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)

Dr.ICL: Demonstration-Retrieved In-context Learning

Luo, Man, Xu, Xin, Dai, Zhuyun, Pasupat, Panupong, Kazemi, Mehran, Baral, Chitta, Imbrasaite, Vaiva, Zhao, Vincent Y.

arXiv.org Artificial Intelligence, May-23-2023

In-context learning (ICL), teaching a large language model (LLM) to perform a task with few-shot demonstrations rather than adjusting the model parameters, has emerged as a strong paradigm for using LLMs. While early studies primarily used a fixed or random set of demonstrations for all test queries, recent research suggests that retrieving semantically similar demonstrations to the input from a pool of available demonstrations results in better performance. This work expands the applicability of retrieval-based ICL approaches by demonstrating that even simple word-overlap similarity measures such as BM25 outperform randomly selected demonstrations. Furthermore, we extend the success of retrieval-based ICL to instruction-finetuned LLMs as well as Chain-of-Thought (CoT) prompting. For instruction-finetuned LLMs, we find that although a model has already seen the training data at training time, retrieving demonstrations from the training data at test time yields better results compared to using no demonstrations or random demonstrations. Last but not least, we train a task-specific demonstration retriever that outperforms off-the-shelf retrievers.
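
The retrieval step described above can be sketched as follows: score each demonstration input against the test query with BM25, keep the top k, and prepend them to the prompt. This is an illustrative sketch, not the paper's implementation. The whitespace tokenizer, the `k1` and `b` defaults, the `Input:`/`Output:` prompt template, and the toy demonstration pool are all assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a sketch.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document in docs against query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def retrieve_demonstrations(query, pool, k=2):
    """pool is a list of (input, output) pairs; return the k best-matching pairs."""
    scores = bm25_scores(query, [inp for inp, _ in pool])
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return [pool[i] for i in ranked[:k]]


def build_prompt(query, demos):
    """Format retrieved demonstrations followed by the unanswered query."""
    blocks = [f"Input: {inp}\nOutput: {out}" for inp, out in demos]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


# Hypothetical demonstration pool.
pool = [
    ("what is the capital of france", "Paris"),
    ("who wrote hamlet", "William Shakespeare"),
    ("what is the capital of japan", "Tokyo"),
    ("how many legs does a spider have", "8"),
]
```

For the query "what is the capital of italy", the two capital-city demonstrations share five query terms and rank first, while the other two share none. A random selection would pick the unrelated demonstrations about half the time.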

demonstration, machine learning, natural language, (17 more...)

2305.14128

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Rethinking the Role of Token Retrieval in Multi-Vector Retrieval

Lee, Jinhyuk, Dai, Zhuyun, Duddu, Sai Meher Karthik, Lei, Tao, Naim, Iftekhar, Chang, Ming-Wei, Zhao, Vincent Y.

arXiv.org Artificial Intelligence, May-23-2023

Multi-vector retrieval models such as ColBERT [Khattab and Zaharia, 2020] allow token-level interactions between queries and documents, and hence achieve state of the art on many information retrieval benchmarks. However, their non-linear scoring function cannot be scaled to millions of documents, necessitating a three-stage process for inference: retrieving initial candidates via token retrieval, accessing all token vectors, and scoring the initial candidate documents. The non-linear scoring function is applied over all token vectors of each candidate document, making the inference process complicated and slow. In this paper, we aim to simplify the multi-vector retrieval by rethinking the role of token retrieval. We present XTR, ConteXtualized Token Retriever, which introduces a simple, yet novel, objective function that encourages the model to retrieve the most important document tokens first. The improvement to token retrieval allows XTR to rank candidates only using the retrieved tokens rather than all tokens in the document, and enables a newly designed scoring stage that is two-to-three orders of magnitude cheaper than that of ColBERT. On the popular BEIR benchmark, XTR advances the state-of-the-art by 2.8 nDCG@10 without any distillation. Detailed analysis confirms our decision to revisit the token retrieval stage, as XTR demonstrates much better recall of the token retrieval stage compared to ColBERT.
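
The contrast between scoring over all document tokens and scoring over only the retrieved tokens can be sketched as below. `sum_of_max` is the ColBERT-style sum-of-max operator. `retrieved_only_score` is a simplified stand-in for the idea of ranking with retrieved tokens only: the `fill` value used when a query token retrieved nothing from a document is an assumption of this sketch, not the scoring rule from the XTR paper. The vectors are toy values.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sum_of_max(query_vecs, doc_vecs):
    """For each query token, take its best-matching document token, then sum.

    This touches every token vector of the document.
    """
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def retrieved_only_score(query_vecs, doc_vecs, retrieved, fill=0.0):
    """Score using only (query_index, doc_token_index) pairs from token retrieval.

    A query token with no retrieved document token contributes `fill`
    (an assumption of this sketch).
    """
    total = 0.0
    for qi, q in enumerate(query_vecs):
        sims = [dot(q, doc_vecs[di]) for (qj, di) in retrieved if qj == qi]
        total += max(sims) if sims else fill
    return total


# Toy token vectors (hypothetical values).
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

full = sum_of_max(query_vecs, doc_vecs)
# Token retrieval found the best document token for both query tokens.
complete = retrieved_only_score(query_vecs, doc_vecs, [(0, 0), (1, 2)])
# Token retrieval missed the second query token entirely.
partial = retrieved_only_score(query_vecs, doc_vecs, [(0, 0)])
```

When token retrieval returns the best-matching document token for every query token, the cheaper score equals the full sum-of-max. When it misses one, the score drops, which is why the quality of the token retrieval stage matters for this style of scoring.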

information retrieval, machine learning, natural language, (18 more...)

2304.01982

Country: North America > United States (0.67)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.93)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
Education (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)

Scaling Instruction-Finetuned Language Models

Chung, Hyung Won, Hou, Le, Longpre, Shayne, Zoph, Barret, Tay, Yi, Fedus, William, Li, Yunxuan, Wang, Xuezhi, Dehghani, Mostafa, Brahma, Siddhartha, Webson, Albert, Gu, Shixiang Shane, Dai, Zhuyun, Suzgun, Mirac, Chen, Xinyun, Chowdhery, Aakanksha, Castro-Ros, Alex, Pellat, Marie, Robinson, Kevin, Valter, Dasha, Narang, Sharan, Mishra, Gaurav, Yu, Adams, Zhao, Vincent, Huang, Yanping, Dai, Andrew, Yu, Hongkun, Petrov, Slav, Chi, Ed H., Dean, Jeff, Devlin, Jacob, Roberts, Adam, Zhou, Denny, Le, Quoc V., Wei, Jason

arXiv.org Artificial Intelligence, Dec-6-2022

Finetuning language models on a collection of datasets phrased as instructions has been shown to improve model performance and generalization to unseen tasks. In this paper we explore instruction finetuning with a particular focus on (1) scaling the number of tasks, (2) scaling the model size, and (3) finetuning on chain-of-thought data. We find that instruction finetuning with the above aspects dramatically improves performance on a variety of model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation). For instance, Flan-PaLM 540B instruction-finetuned on 1.8K tasks outperforms PaLM 540B by a large margin (+9.4% on average). Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints, which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.

artificial intelligence, machine learning, natural language, (19 more...)

2210.11416

Country:

North America > United States (1.00)
Europe (0.92)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (1.00)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

PGT: Pseudo Relevance Feedback Using a Graph-Based Transformer

Yu, HongChien, Dai, Zhuyun, Callan, Jamie

arXiv.org Artificial Intelligence, Jan-19-2021

Most research on pseudo relevance feedback (PRF) has been done in vector space and probabilistic retrieval models. This paper shows that Transformer-based rerankers can also benefit from the extra context that PRF provides. It presents PGT, a graph-based Transformer that sparsifies attention between graph nodes to enable PRF while avoiding the high computational complexity of most Transformer architectures. Experiments show that PGT improves upon a non-PRF Transformer reranker and is at least as accurate as Transformer PRF models that use full attention, but at lower computational cost.

artificial intelligence, feedback document, neural network, (18 more...)

2101.07918

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
