AITopics | Sachan, Devendra Singh

Sachan, Devendra Singh

Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

Lee, Jinhyuk, Chen, Anthony, Dai, Zhuyun, Dua, Dheeru, Sachan, Devendra Singh, Boratko, Michael, Luan, Yi, Arnold, Sébastien M. R., Perot, Vincent, Dalmia, Siddharth, Hu, Hexiang, Lin, Xudong, Pasupat, Panupong, Amini, Aida, Cole, Jeremy R., Riedel, Sebastian, Naim, Iftekhar, Chang, Ming-Wei, Guu, Kelvin

arXiv.org Artificial Intelligence, Jun-18-2024

Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases. Leveraging LCLMs' ability to natively ingest and process entire corpora of information offers numerous advantages. It enhances user-friendliness by eliminating the need for specialized knowledge of tools, provides robust end-to-end modeling that minimizes cascading errors in complex pipelines, and allows for the application of sophisticated prompting techniques across the entire system. To assess this paradigm shift, we introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning. Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks. However, LCLMs still face challenges in areas like compositional reasoning that are required in SQL-like tasks. Notably, prompting strategies significantly influence performance, emphasizing the need for continued research as context lengths grow. Overall, LOFT provides a rigorous testing ground for LCLMs, showcasing their potential to supplant existing paradigms and tackle novel tasks as model capabilities scale.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2406.13121

Country:

Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Leisure & Entertainment > Sports > Hockey (1.00)
Media (0.93)
Automobiles & Trucks > Manufacturer (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)

Questions Are All You Need to Train a Dense Passage Retriever

Sachan, Devendra Singh, Lewis, Mike, Yogatama, Dani, Zettlemoyer, Luke, Pineau, Joelle, Zaheer, Manzil

arXiv.org Artificial Intelligence, Apr-2-2023

We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data. Dense retrieval is a central challenge for open-domain tasks, such as Open QA, where state-of-the-art methods typically require large supervised datasets with custom hard-negative mining and denoising of positive examples. ART, in contrast, only requires access to unpaired inputs and outputs (e.g. questions and potential answer documents). It uses a new document-retrieval autoencoding scheme, where (1) an input question is used to retrieve a set of evidence documents, and (2) the documents are then used to compute the probability of reconstructing the original question. Training for retrieval based on question reconstruction enables effective unsupervised learning of both document and question encoders, which can be later incorporated into complete Open QA systems without any further finetuning. Extensive experiments demonstrate that ART obtains state-of-the-art results on multiple QA retrieval benchmarks with only generic initialization from a pre-trained language model, removing the need for labeled data and task-specific losses.
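
The two-step scheme above (retrieve evidence documents, then score how well they reconstruct the question) can be sketched as a loss over the retrieved set. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not spell out the loss, so the sketch assumes a KL divergence that pulls the retriever's distribution over the retrieved documents toward the distribution implied by the question-reconstruction log-likelihoods. The function names and the input numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reconstruction_kl_loss(retriever_scores, question_log_likelihoods):
    """KL(target || predicted) over the retrieved documents.

    retriever_scores: question-document similarity scores from the
        retriever, one per retrieved document (step 1).
    question_log_likelihoods: log p(question | document) from a
        language model, one per retrieved document (step 2).
    Assumption: the reconstruction likelihoods act as soft targets
    for the retriever; the abstract does not confirm this detail.
    """
    target = softmax(question_log_likelihoods)
    predicted = softmax(retriever_scores)
    return sum(
        t * (math.log(t) - math.log(p))
        for t, p in zip(target, predicted)
        if t > 0
    )


if __name__ == "__main__":
    # Hypothetical scores for three retrieved documents.
    print(reconstruction_kl_loss([0.2, 1.5, -0.3], [-12.0, -9.5, -14.0]))
```

The loss is zero when the retriever already ranks documents in proportion to how well they reconstruct the question, and grows as the two distributions diverge.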

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2206.10658

Country: North America (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Improving Passage Retrieval with Zero-Shot Question Generation

Sachan, Devendra Singh, Lewis, Mike, Joshi, Mandar, Aghajanyan, Armen, Yih, Wen-tau, Pineau, Joelle, Zettlemoyer, Luke

arXiv.org Artificial Intelligence, Apr-2-2023

Queries and documents are typically embedded in a shared representation space to enable efficient search, before using a task-specific model to perform a deeper, token-level document analysis (e.g. a document reader that selects an answer span). We show that adding a zero-shot re-ranker to the retrieval stage of such models leads to large gains in performance, by doing deep token-level analysis with no task-specific data or tuning.

... of query scoring with count-based language models (Zhai and Lafferty, 2001). However, instead of estimating a language model from each passage, UPR uses pre-trained language models (PLMs). More recent work on re-rankers has finetuned PLMs on question-passage pairs to generate relevance labels (Nogueira et al., 2020), sometimes to jointly generate question and relevance labels (Nogueira dos Santos et al., 2020; Ju et al., ...
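
The re-ranking idea in this excerpt, ordering retrieved passages by how likely each one makes the question, can be sketched as follows. This is a toy illustration, not the UPR implementation: a smoothed unigram count model (in the spirit of the count-based query scoring the excerpt cites) stands in for the pre-trained language model, and the function names and the `vocab_size` parameter are hypothetical.

```python
import math
from collections import Counter


def question_log_likelihood(question, passage, vocab_size=10000):
    """log p(question | passage) under an add-one smoothed unigram model.

    Stand-in scorer only: a real zero-shot re-ranker would use a
    pre-trained language model conditioned on the passage instead.
    """
    counts = Counter(passage.lower().split())
    total = sum(counts.values())
    return sum(
        math.log((counts[word] + 1) / (total + vocab_size))
        for word in question.lower().split()
    )


def rerank(question, passages):
    """Sort passages from most to least likely to generate the question."""
    return sorted(
        passages,
        key=lambda p: question_log_likelihood(question, p),
        reverse=True,
    )


if __name__ == "__main__":
    retrieved = [
        "the eiffel tower is in paris",
        "hamlet was a play that shakespeare wrote",
    ]
    for passage in rerank("who wrote hamlet", retrieved):
        print(passage)
```

Because the scorer is passed the question and passage only, no relevance labels or task-specific tuning are needed, which is the property the excerpt emphasizes.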

artificial intelligence, natural language, question answering, (19 more...)

arXiv.org Artificial Intelligence

2204.07496

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Sports > Baseball (1.00)
Media > Film (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering

Sachan, Devendra Singh, Reddy, Siva, Hamilton, William, Dyer, Chris, Yogatama, Dani

arXiv.org Artificial Intelligence, Jun-9-2021

We present an end-to-end differentiable training method for retrieval-augmented open-domain question answering systems that combine information from multiple retrieved documents when generating answers. We model retrieval decisions as latent variables over sets of relevant documents. Since marginalizing over sets of retrieved documents is computationally hard, we approximate this using an expectation-maximization algorithm. We iteratively estimate the value of our latent variable (the set of relevant documents for a given question) and then use this estimate to update the retriever and reader parameters. We hypothesize that such end-to-end training allows training signals to flow to the reader and then to the retriever better than staged-wise training. This results in a retriever that is able to select more relevant documents for a question and a reader that is trained on more accurate documents to generate an answer. Experiments on three benchmark datasets demonstrate that our proposed method outperforms all existing approaches of comparable size by 2-3% absolute exact match points, achieving new state-of-the-art results. Our results also demonstrate the feasibility of learning to retrieve to improve answer generation without explicit supervision of retrieval decisions.

artificial intelligence, computer game, retriever, (18 more...)

arXiv.org Artificial Intelligence

2106.05346

Country: North America > Canada > Quebec (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Games > Computer Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

End-to-End Training of Neural Retrievers for Open-Domain Question Answering

Sachan, Devendra Singh, Patwary, Mostofa, Shoeybi, Mohammad, Kant, Neel, Ping, Wei, Hamilton, William L, Catanzaro, Bryan

arXiv.org Artificial Intelligence, Jan-2-2021

Recent work on training neural retrievers for open-domain question answering (OpenQA) has employed both supervised and unsupervised approaches. However, it remains unclear how unsupervised and supervised methods can be used most effectively for neural retrievers. In this work, we systematically study retriever pre-training. We first propose an approach of unsupervised pre-training with the Inverse Cloze Task and masked salient spans, followed by supervised finetuning using question-context pairs. This approach leads to absolute gains of 2+ points over the previous best result in the top-20 retrieval accuracy on Natural Questions and TriviaQA datasets. We also explore two approaches for end-to-end supervised training of the reader and retriever components in OpenQA models. In the first approach, the reader considers each retrieved document separately while in the second approach, the reader considers all the retrieved documents together. Our experiments demonstrate the effectiveness of these approaches as we obtain new state-of-the-art results. On the Natural Questions dataset, we obtain a top-20 retrieval accuracy of 84, an improvement of 5 points over the recent DPR model. In addition, we achieve good results on answer extraction, outperforming recent models like REALM and RAG by 3+ points. We further scale up end-to-end training to large models and show consistent gains in performance over smaller models.
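
The top-20 retrieval accuracy reported above is commonly computed as the fraction of questions for which at least one of the top-k retrieved passages contains a gold answer. A minimal sketch of that metric follows; the function name, the case-insensitive substring match, and the toy data are assumptions for illustration and may differ from the paper's exact evaluation script.

```python
def top_k_accuracy(retrieved, answers, k=20):
    """Fraction of questions with an answer string in the top-k passages.

    retrieved: list of ranked passage lists, one list per question.
    answers: list of gold answer-string lists, one list per question.
    Assumption: a passage counts as a hit if it contains any gold
    answer as a case-insensitive substring.
    """
    if not retrieved:
        return 0.0
    hits = 0
    for passages, golds in zip(retrieved, answers):
        top = passages[:k]
        if any(g.lower() in p.lower() for p in top for g in golds):
            hits += 1
    return hits / len(retrieved)


if __name__ == "__main__":
    # Toy data: two questions, two ranked passages each.
    ranked = [
        ["Paris is the capital of France.", "Lyon is a city in France."],
        ["The Nile flows north.", "The Amazon is in South America."],
    ]
    gold = [["Paris"], ["Danube"]]
    print(top_k_accuracy(ranked, gold, k=2))
```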

artificial intelligence, natural language, retriever, (17 more...)

arXiv.org Artificial Intelligence

2101.00408

Country:

Europe > Italy (0.28)
North America > United States (0.28)
North America > Canada > Quebec (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.71)

Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation

Hu, Zhiting, Shi, Haoran, Yang, Zichao, Tan, Bowen, Zhao, Tiancheng, He, Junxian, Wang, Wentao, Yu, Xingjiang, Qin, Lianhui, Wang, Di, Ma, Xuezhe, Liu, Hector, Liang, Xiaodan, Zhu, Wanrong, Sachan, Devendra Singh, Xing, Eric P.

arXiv.org Artificial Intelligence, Sep-4-2018

We introduce Texar, an open-source toolkit aiming to support the broad set of text generation tasks that transforms any inputs into natural language, such as machine translation, summarization, dialog, content manipulation, and so forth. With the design goals of modularity, versatility, and extensibility in mind, Texar extracts common patterns underlying the diverse tasks and methodologies, creates a library of highly reusable modules and functionalities, and allows arbitrary model architectures and algorithmic paradigms. In Texar, model architecture, losses, and learning processes are fully decomposed. Modules at high concept level can be freely assembled or plugged in/swapped out. These features make Texar particularly suitable for researchers and practitioners to do fast prototyping and experimentation, as well as foster technique sharing across different text generation tasks. We provide case studies to demonstrate the use and advantage of the toolkit. Texar is released under Apache license 2.0 at https://github.com/asyml/texar.

arxiv preprint arxiv, deep learning, neural network, (19 more...)

arXiv.org Artificial Intelligence

1809.00794

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)

Sports Video Classification from Multimodal Information Using Deep Neural Networks

Sachan, Devendra Singh (Indian Institute of Technology, Guwahati) | Tekwani, Umesh (Indian Institute of Technology, Guwahati) | Sethi, Amit (Indian Institute of Technology, Guwahati)

AAAI Conferences, Nov-14-2013

This work presents a methodology for classifying sports videos using both audio and visual information by applying deep learning algorithms. We show how to combine multiple deep learning architectures through higher layers. Our method learns two separate models trained on the audio and visual parts of the data. We train the model for the audio part of the multimedia input using two stacked layers of CRBMs forming a CDBN. We also train a two-layered ISA network to extract features from the video part of the data. We then train a deep stacked autoencoder over both audio and visual features with discriminative fine-tuning. Our results show that combining audio and visual features yields better accuracy than using a single type of feature.

deep neural network, multimodal information, sport video classification

AAAI Conferences

2013 AAAI Fall Symposium Series

Genre: Research Report > New Finding (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
