Rajagopal, Dheeraj
Scalable Influence and Fact Tracing for Large Language Model Pretraining
Chang, Tyler A., Rajagopal, Dheeraj, Bolukbasi, Tolga, Dixon, Lucas, Tenney, Ian
Training data attribution (TDA) methods aim to attribute model outputs back to specific training examples, and the application of these methods to large language model (LLM) outputs could significantly advance model transparency and data curation. However, it has been challenging to date to apply these methods to the full scale of LLM pretraining. In this paper, we refine existing gradient-based methods to work effectively at scale, allowing us to retrieve influential examples for an 8B-parameter language model from a pretraining corpus of over 160B tokens with no need for subsampling or pre-filtering. Our method combines several techniques, including optimizer state correction, a task-specific Hessian approximation, and normalized encodings, which we find to be critical for performance at scale. In quantitative evaluations on a fact tracing task, our method performs best at identifying examples that influence model predictions, but classical, model-agnostic retrieval methods such as BM25 still perform better at finding passages which explicitly contain relevant facts. These results demonstrate a misalignment between factual *attribution* and causal *influence*. With increasing model size and training tokens, we find that influence more closely aligns with factual attribution. Finally, we examine different types of examples identified as influential by our method, finding that while many directly entail a particular fact, others support the same output by reinforcing priors on relation types, common entities, and names. We release our prompt set and model outputs, along with a web-based visualization tool to explore influential examples for factual predictions, commonsense reasoning, arithmetic, and open-ended generation for an 8B-parameter LLM.
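To make the retrieval step concrete, the sketch below ranks training examples by the dot product of unit-normalized gradient encodings. It is a minimal illustration, not the paper's method: the optimizer state correction and task-specific Hessian approximation are collapsed into a placeholder diagonal `preconditioner`, and all names and sizes are assumptions.

```python
import numpy as np

def normalized_encoding(grad: np.ndarray, preconditioner: np.ndarray) -> np.ndarray:
    """Scale a flattened per-example gradient and normalize it to unit length.

    `preconditioner` is a stand-in for the paper's optimizer-state correction and
    task-specific Hessian approximation (here just a diagonal scaling).
    """
    g = grad / preconditioner                      # placeholder second-order correction
    return g / (np.linalg.norm(g) + 1e-8)

def rank_training_examples(query_grad, train_grads, preconditioner, top_k=10):
    """Score each training example by its approximate influence on the query
    prediction, computed as the dot product of normalized gradient encodings."""
    q = normalized_encoding(query_grad, preconditioner)
    scores = np.array([normalized_encoding(g, preconditioner) @ q for g in train_grads])
    top = np.argsort(-scores)[:top_k]
    return top, scores[top]

# Toy usage with random vectors standing in for real per-example gradients.
rng = np.random.default_rng(0)
dim = 512
query_grad = rng.normal(size=dim)
train_grads = rng.normal(size=(1000, dim))
preconditioner = np.ones(dim)
idx, scores = rank_training_examples(query_grad, train_grads, preconditioner)
print(idx, scores)
```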
Confidence Calibration and Rationalization for LLMs via Multi-Agent Deliberation
Yang, Ruixin, Rajagopal, Dheeraj, Hayati, Shirley Anugrah, Hu, Bin, Kang, Dongyeop
Uncertainty estimation is a significant issue for current large language models (LLMs), which are generally poorly calibrated and over-confident, especially after reinforcement learning from human feedback (RLHF). Although contemporary LLMs achieve remarkable performance on tasks ranging from question answering to complex reasoning (Brown et al., 2020; Bubeck et al., 2023), producing well-calibrated confidence estimates for their predictions remains a significant bottleneck: an individual model's confidence is not a reliable indicator of accuracy. Models often generate hallucinations (Bubeck et al., 2023) or wildly wrong predictions, unknowingly and over-confidently, a tendency that is more pronounced for models fine-tuned with RLHF (Kadavath et al., 2022; Tian et al., 2023). At the same time, models can be inconsistent and under-confident, blindly altering decisions and deferring to incorrect user opinions (Wei et al., 2023). Unlike humans, whose decisions and confidence stem not only from intrinsic beliefs but are also adjusted through everyday observation, existing calibration methods for LLMs focus on estimating or eliciting individual confidence without taking advantage of the "Collective Wisdom": the interaction among multiple LLMs, which can collectively improve both accuracy and calibration. In this work, we propose Collaborative Calibration, a post-hoc, training-free calibration strategy that leverages the collaborative and expressive capabilities of multiple tool-augmented LLM agents in a simulated group deliberation process.
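As a rough illustration of how answers and confidences from several agents might be combined, the sketch below reduces group deliberation to a confidence-weighted vote. This is an assumption-laden simplification: the paper's tool-augmented agents, rationale exchange, and deliberation rounds are not modeled, and the function and data layout are invented for illustration.

```python
from collections import defaultdict

def aggregate_group_confidence(agent_outputs):
    """Combine (answer, confidence) pairs from multiple agents into a single
    answer with a group-level confidence estimate.

    Minimal confidence-weighted vote; the simulated group deliberation described
    in the paper (tools, rationales, multiple rounds) is intentionally omitted.
    """
    totals = defaultdict(float)
    for answer, confidence in agent_outputs:
        totals[answer] += confidence
    best_answer = max(totals, key=totals.get)
    group_confidence = totals[best_answer] / sum(totals.values())
    return best_answer, group_confidence

# Hypothetical outputs from three agents.
outputs = [("Paris", 0.9), ("Paris", 0.6), ("Lyon", 0.4)]
print(aggregate_group_confidence(outputs))  # ('Paris', ~0.79)
```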
How Far Can We Extract Diverse Perspectives from Large Language Models? Criteria-Based Diversity Prompting!
Hayati, Shirley Anugrah, Lee, Minhwa, Rajagopal, Dheeraj, Kang, Dongyeop
Collecting diverse human data on subjective NLP topics is costly and challenging. As Large Language Models (LLMs) have developed human-like capabilities, there is a recent trend toward human-LLM collaboration for generating diverse data, offering a potentially scalable and efficient solution. However, the extent to which LLMs can generate diverse perspectives on subjective topics remains an open question. In this study, we investigate LLMs' capacity for generating diverse perspectives and rationales on subjective topics such as social norms and argumentative texts. We formulate this problem as diversity extraction in LLMs and propose a criteria-based prompting technique to ground diverse opinions and to measure perspective diversity from the generated criteria words. Our results show that measuring semantic diversity through sentence embeddings and distance metrics is not sufficient to capture perspective diversity. To see how far we can extract diverse perspectives from LLMs, which we call diversity coverage, we employ step-by-step recall prompting to elicit additional outputs from the model iteratively, as sketched below. Applying our prompting method to other tasks (hate speech labeling and story continuation), we find that LLMs are indeed able to generate diverse opinions according to the degree of task subjectivity.
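The sketch below shows one way an iterative recall-prompting loop could be wired up. It is only an assumed shape for the idea: `generate` is a hypothetical callable wrapping an LLM, and the prompt wording is illustrative rather than the paper's actual template.

```python
def extract_diverse_perspectives(statement, generate, max_rounds=5):
    """Iteratively prompt an LLM for perspectives (with criteria words) that are
    not yet covered, stopping when no new perspective is returned.

    `generate` is a hypothetical callable that sends a prompt to an LLM and
    returns its text response.
    """
    perspectives = []
    for _ in range(max_rounds):
        prompt = (
            f"Statement: {statement}\n"
            f"Perspectives so far: {perspectives}\n"
            "Give one NEW perspective with its criteria word, or say DONE."
        )
        response = generate(prompt).strip()
        if response.upper() == "DONE" or response in perspectives:
            break
        perspectives.append(response)
    return perspectives
```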
AutoMix: Automatically Mixing Language Models
Madaan, Aman, Aggarwal, Pranjal, Anand, Ankit, Potharaju, Srividya Pranavi, Mishra, Swaroop, Zhou, Pei, Gupta, Aditya, Rajagopal, Dheeraj, Kappaganthu, Karthik, Yang, Yiming, Upadhyay, Shyam, Mausam, Faruqui, Manaal
Large language models (LLMs) are now available in various sizes and configurations from cloud API providers. While this diversity offers a broad spectrum of choices, effectively leveraging these options to optimize computational cost and performance remains challenging. In this work, we present AutoMix, an approach that strategically routes queries to larger LMs based on the approximate correctness of outputs from a smaller LM. Central to AutoMix is a few-shot self-verification mechanism through which the smaller LM estimates the reliability of its own outputs without requiring training. Because these verifications can be noisy, AutoMix employs a meta-verifier to refine the accuracy of the assessments. Our experiments using LLAMA2-13B/70B on five context-grounded reasoning datasets demonstrate that AutoMix surpasses established baselines, improving the incremental benefit per cost by up to 89%. Our code and data are available at https://github.com/automix-llm/automix.
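A minimal sketch of the routing idea follows, under stated assumptions: `small_lm`, `large_lm`, and `self_verify` are hypothetical callables, and the meta-verifier is reduced to a simple probability threshold rather than the paper's learned/meta verification step.

```python
def automix_route(query, context, small_lm, large_lm, self_verify,
                  n_samples=8, threshold=0.6):
    """Answer with the small LM first; escalate to the large LM when few-shot
    self-verification judges the small LM's draft unreliable.

    `small_lm`, `large_lm`, and `self_verify` are hypothetical callables; the
    threshold stands in for the meta-verifier for illustration only.
    """
    draft = small_lm(query, context)
    # Estimate correctness by sampling several boolean verification verdicts.
    votes = [self_verify(query, context, draft) for _ in range(n_samples)]
    p_correct = sum(votes) / n_samples
    if p_correct >= threshold:
        return draft, "small"
    return large_lm(query, context), "large"
```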
StyLEx: Explaining Style Using Human Lexical Annotations
Hayati, Shirley Anugrah, Park, Kyumin, Rajagopal, Dheeraj, Ungar, Lyle, Kang, Dongyeop
Large pre-trained language models have achieved impressive results on various style classification tasks, but they often learn spurious domain-specific words to make predictions (Hayati et al., 2021). While human explanations highlight stylistic tokens as important features for this task, we observe that model explanations often do not align with them. To tackle this issue, we introduce StyLEx, a model that learns from human-annotated explanations of stylistic features and jointly learns to perform the task and to predict these features as model explanations. Our experiments show that StyLEx can provide human-like stylistic lexical explanations without sacrificing sentence-level style prediction performance on both in-domain and out-of-domain datasets. Explanations from StyLEx show significant improvements on explanation metrics (sufficiency, plausibility) when evaluated against human annotations, and they are more understandable to human judges than the widely used saliency-based explanation baseline.
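The joint-learning setup can be pictured as a shared encoder with two heads trained under a combined loss. The sketch below is a toy stand-in, not the paper's architecture: the tiny embedding-plus-mean-pooling encoder replaces a pre-trained language model, and all layer sizes, names, and the loss weight are assumptions.

```python
import torch
import torch.nn as nn

class JointStyleExplainer(nn.Module):
    """Toy joint model in the spirit of StyLEx: a shared encoder feeds both a
    sentence-level style head and a token-level explanation head."""
    def __init__(self, vocab_size=1000, dim=64, num_styles=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.style_head = nn.Linear(dim, num_styles)  # sentence-level style label
        self.expl_head = nn.Linear(dim, 2)            # is each token stylistic or not

    def forward(self, token_ids):
        h = self.embed(token_ids)                      # (batch, seq, dim)
        style_logits = self.style_head(h.mean(dim=1))  # pooled sentence representation
        expl_logits = self.expl_head(h)                # per-token explanation logits
        return style_logits, expl_logits

def joint_loss(style_logits, expl_logits, style_labels, expl_labels, lam=0.5):
    """Combined objective: style classification plus token-level explanation."""
    ce = nn.CrossEntropyLoss()
    style_loss = ce(style_logits, style_labels)
    expl_loss = ce(expl_logits.reshape(-1, 2), expl_labels.reshape(-1))
    return style_loss + lam * expl_loss

# Toy usage with random token ids and labels.
model = JointStyleExplainer()
tokens = torch.randint(0, 1000, (4, 12))
s_logits, e_logits = model(tokens)
loss = joint_loss(s_logits, e_logits,
                  torch.randint(0, 2, (4,)), torch.randint(0, 2, (4, 12)))
print(loss.item())
```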
Cross-Domain Reasoning via Template Filling
Rajagopal, Dheeraj, Khetan, Vivek, Sacaleanu, Bogdan, Gershman, Anatole, Fano, Andrew, Hovy, Eduard
In this paper, we explore the ability of sequence-to-sequence models to perform cross-domain reasoning. Towards this, we present a prompt-template-filling approach that enables sequence-to-sequence models to reason across domains. We also present a case study in the commonsense and the health and well-being domains, where we study how prompt-template filling enables pretrained sequence-to-sequence models to transfer reasoning across domains. Our experiments across several pretrained encoder-decoder models show that cross-domain reasoning is challenging for current models. We also provide an in-depth error analysis and outline avenues for future research on reasoning across domains.
Think about it! Improving defeasible reasoning by first modeling the question scenario
Madaan, Aman, Tandon, Niket, Rajagopal, Dheeraj, Clark, Peter, Yang, Yiming, Hovy, Eduard
Defeasible reasoning is the mode of reasoning where conclusions can be overturned by taking into account new evidence. Existing cognitive science literature on defeasible reasoning suggests that a person forms a mental model of the problem scenario before answering questions. We ask whether neural models can similarly benefit from envisioning the question scenario before answering a defeasible query. Given a question, our approach has the model first create a graph of relevant influences, and then leverage that graph as an additional input when answering the question. Our system, CURIOUS, achieves a new state of the art on three different defeasible reasoning datasets. This result is significant because it illustrates that performance can be improved by guiding a system to "think about" a question and explicitly model the scenario, rather than answering reflexively. Code, data, and pre-trained models are located at https://github.com/madaan/thinkaboutit.
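The two-stage pipeline (generate a graph, then answer with the graph as extra input) can be sketched as below. This is only an assumed interface: `graph_generator` and `qa_model` are hypothetical callables, and the real system's graph format, prompting, and training details differ.

```python
def answer_defeasible_query(premise, hypothesis, update, graph_generator, qa_model):
    """Two-stage sketch in the spirit of CURIOUS: first generate an inference
    graph for the scenario, then answer with the graph as additional input.

    `graph_generator` and `qa_model` are hypothetical callables supplied by the
    caller (e.g., wrappers around fine-tuned seq2seq models).
    """
    scenario = f"Premise: {premise} Hypothesis: {hypothesis} Update: {update}"
    graph = graph_generator(scenario)   # "think about" the question scenario first
    answer = qa_model(scenario, graph)  # e.g., strengthener vs. weakener decision
    return graph, answer
```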
Could you give me a hint? Generating inference graphs for defeasible reasoning
Madaan, Aman, Rajagopal, Dheeraj, Tandon, Niket, Yang, Yiming, Hovy, Eduard
Defeasible reasoning is the mode of reasoning where conclusions can be overturned by taking into account new evidence. A commonly used method in the philosophy and AI literature is to handcraft argumentation-supporting inference graphs. While humans find inference graphs very useful for reasoning, constructing them at scale is difficult. In this paper, we automatically generate such inference graphs through transfer learning from another NLP task that shares the kind of reasoning that inference graphs support. Through automated metrics and human evaluation, we find that our method generates meaningful graphs for the defeasible inference task. Human accuracy on this task improves by 20% when consulting the generated graphs. Our findings open up exciting new research avenues for cases where machine reasoning can help human reasoning. (A dataset of 230,000 influence graphs for each defeasible query is located at: https://tinyurl.com/defeasiblegraphs.)
Gated-Attention Architectures for Task-Oriented Language Grounding
Chaplot, Devendra Singh, Sathyendra, Kanthashree Mysore, Pasumarthi, Rama Kumar, Rajagopal, Dheeraj, Salakhutdinov, Ruslan
To perform tasks specified by natural language instructions, autonomous agents need to extract semantically meaningful representations of language and map them to visual elements and actions in the environment. This problem is called task-oriented language grounding. We propose an end-to-end trainable neural architecture for task-oriented language grounding in 3D environments which assumes no prior linguistic or perceptual knowledge and requires only raw pixels from the environment and the natural language instruction as input. The proposed model combines the image and text representations using a Gated-Attention mechanism and learns a policy to execute the natural language instruction using standard reinforcement and imitation learning methods. We show the effectiveness of the proposed model on unseen instructions as well as unseen maps, both quantitatively and qualitatively. We also introduce a novel environment based on a 3D game engine to simulate the challenges of task-oriented language grounding over a rich set of instructions and environment states.
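A minimal PyTorch sketch of the gated fusion idea follows: the instruction embedding is mapped to a per-channel gate that multiplicatively modulates the image feature maps. Layer sizes and names are assumptions, and the surrounding CNN, instruction encoder, and policy network are omitted.

```python
import torch
import torch.nn as nn

class GatedAttention(nn.Module):
    """Sketch of Gated-Attention fusion: a sigmoid gate computed from the
    instruction embedding scales each channel of the image feature maps."""
    def __init__(self, text_dim=256, num_channels=64):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(text_dim, num_channels), nn.Sigmoid())

    def forward(self, image_feats, text_emb):
        # image_feats: (batch, channels, H, W); text_emb: (batch, text_dim)
        g = self.gate(text_emb).unsqueeze(-1).unsqueeze(-1)  # (batch, channels, 1, 1)
        return image_feats * g                               # element-wise gated fusion

# Toy usage with random tensors standing in for CNN features and an instruction embedding.
fused = GatedAttention()(torch.randn(2, 64, 8, 8), torch.randn(2, 256))
print(fused.shape)  # torch.Size([2, 64, 8, 8])
```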