AITopics | Raparthy, Sharath Chandra

Collaborating Authors

Raparthy, Sharath Chandra

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models

Havrilla, Alex, Dai, Andrew, O'Mahony, Laura, Oostermeijer, Koen, Zisler, Vera, Albalak, Alon, Milo, Fabrizio, Raparthy, Sharath Chandra, Gandhi, Kanishk, Abbasi, Baber, Phung, Duy, Iyer, Maia, Mahan, Dakota, Blagden, Chase, Gureja, Srishti, Hamdy, Mohammed, Li, Wen-Ding, Paolini, Giovanni, Ammanamanchi, Pawan Sasanka, Meyerson, Elliot

arXiv.org Artificial IntelligenceDec-9-2024

Synthetic data generation with Large Language Models is a promising paradigm for augmenting natural data over a nearly infinite range of tasks. Given this variety, direct comparisons among synthetic data generation algorithms are scarce, making it difficult to understand where improvement comes from and what bottlenecks exist. We propose to evaluate algorithms via the makeup of synthetic data generated by each algorithm in terms of data quality, diversity, and complexity. We choose these three characteristics for their significance in open-ended processes and the impact each has on the capabilities of downstream models. We find quality to be essential for in-distribution model generalization, diversity to be essential for out-of-distribution generalization, and complexity to be beneficial for both. Further, we emphasize the existence of Quality-Diversity trade-offs in training data and the downstream effects on model performance. We then examine the effect of various components in the synthetic data pipeline on each data characteristic. This examination allows us to taxonomize and compare synthetic data generation algorithms through the components they utilize and the resulting effects on data QDC composition. This analysis extends into a discussion on the importance of balancing QDC in synthetic data for efficient reinforcement learning and self-improvement algorithms. Analogous to the QD trade-offs in training data, often there exist trade-offs between model output quality and output diversity which impact the composition of synthetic data. We observe that many models are currently evaluated and optimized only for output quality, thereby limiting output diversity and the potential for self-improvement. We argue that balancing these trade-offs is essential to the development of future self-improvement algorithms and highlight a number of works making progress in this direction.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2412.0298

Country:

Europe (1.00)
Asia (0.92)
North America > United States (0.67)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Education (1.00)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Teaching Large Language Models to Reason with Reinforcement Learning

Havrilla, Alex, Du, Yuqing, Raparthy, Sharath Chandra, Nalmpantis, Christoforos, Dwivedi-Yu, Jane, Zhuravinskyi, Maksym, Hambro, Eric, Sukhbaatar, Sainbayar, Raileanu, Roberta

arXiv.org Artificial IntelligenceMar-7-2024

Simultaneously, Reinforcement Learning from Human Feedback (RLHF) (Bai et al., 2022; Ziegler et al., 2019; Ouyang et al., 2022) and instruction fine-tuning (Wei et al., 2021; Mishra et al., 2021) have made significant progress in aligning LLMs with human preferences. Improvements in model instructability have further increased apparent model capability by making complex behaviors more accessible via instruction prompting. This has led to a number of increasingly sophisticated prompting strategies augmenting LLM reasoning capabilities such as Chain-of-Thought (Wei et al., 2022) or Tree-of-Thoughts (Yao et al., 2023). Previous work in reinforcement learning (RL) such as AlphaGo (Silver et al., 2017), AlphaStar (Vinyals et al., 2019), and OpenAI Dota 2 (Berner et al., 2019) demonstrate that RL techniques can be used to train neural networks capable of sophisticated planning and reasoning in game environments. Cicero (Bakhtin et al., 2022) in particular succeeds in combining an RL trained planning agent with a dialogue fine-tuned LLM to achieve nearly super-human performance in the board game Diplomacy. Given these previous successes and the inherent interactive nature of problem solving, applying RL to LLM reasoning seems a natural next step. In this paper, we study how ideas from RL can be used to improve the reasoning capabilities of LLMs across a variety of reward schemes and model initializations. We begin by comparing the performance of different RL algorithms on reasoning tasks τ defined as a distribution of question answer tuples (Q, A).

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2403.04642

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts

Samvelyan, Mikayel, Raparthy, Sharath Chandra, Lupu, Andrei, Hambro, Eric, Markosyan, Aram H., Bhatt, Manish, Mao, Yuning, Jiang, Minqi, Parker-Holder, Jack, Foerster, Jakob, Rocktäschel, Tim, Raileanu, Roberta

arXiv.org Artificial IntelligenceFeb-26-2024

Large language models (LLMs) have recently experienced remarkable growth in both their capabilities (OpenAI, 2023; Gemini Team et al., 2023; Touvron et al., 2023) and their applications in various fields (NLLB Team et al., 2022; Thirunavukarasu et al., 2023; Schick et al., 2023; Bubeck et al., 2023). As LLMs become increasingly complex and are deployed in safety-critical environments (Singhal et al., 2022; Li et al., 2023; Maddela et al., 2023), it is essential to thoroughly understand their robustness to different inputs. Indeed, the susceptibility of LLMs to user inputs and adversarial prompts -- prompts crafted to mislead the model or exploit its weaknesses, potentially leading to unsafe, biased, or incorrect outputs -- poses a significant challenge (Perez et al., 2022; Wei et al., 2023; Zou et al., 2023). Identifying these vulnerabilities and subsequently mitigating such risks is therefore vital to ensure the safe and reliable operation of LLMs in the real world. Current methods for identifying adversarial prompts aimed at "attacking" LLMs and eliciting undesirable outputs are limited by several factors.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2402.16822

Country:

North America > Canada (0.14)
North America > United States (0.14)
Europe > France (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)
Law (0.93)
Law Enforcement & Public Safety (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Add feedback

Generalization to New Sequential Decision Making Tasks with In-Context Learning

Raparthy, Sharath Chandra, Hambro, Eric, Kirk, Robert, Henaff, Mikael, Raileanu, Roberta

arXiv.org Artificial IntelligenceDec-6-2023

Training autonomous agents that can learn new tasks from only a handful of demonstrations is a long-standing problem in machine learning. Recently, transformers have been shown to learn new language or vision tasks without any weight updates from only a few examples, also referred to as in-context learning. However, the sequential decision making setting poses additional challenges having a lower tolerance for errors since the environment's stochasticity or the agent's actions can lead to unseen, and sometimes unrecoverable, states. In this paper, we use an illustrative example to show that naively applying transformers to sequential decision making problems does not enable in-context learning of new tasks. We then demonstrate how training on sequences of trajectories with certain distributional properties leads to in-context learning of new sequential decision making tasks. We investigate different design choices and find that larger model and dataset sizes, as well as more task diversity, environment stochasticity, and trajectory burstiness, all result in better in-context learning of new out-of-distribution tasks. By training on large diverse offline datasets, our model is able to learn new MiniHack and Procgen tasks without any weight updates from just a handful of demonstrations.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2312.03801

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.66)
Leisure & Entertainment > Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Multi-Objective GFlowNets

Jain, Moksh, Raparthy, Sharath Chandra, Hernandez-Garcia, Alex, Rector-Brooks, Jarrid, Bengio, Yoshua, Miret, Santiago, Bengio, Emmanuel

arXiv.org Artificial IntelligenceJul-17-2023

We study the problem of generating diverse candidates in the context of Multi-Objective Optimization. In many applications of machine learning such as drug discovery and material design, the goal is to generate candidates which simultaneously optimize a set of potentially conflicting objectives. Moreover, these objectives are often imperfect evaluations of some underlying property of interest, making it important to generate diverse candidates to have multiple options for expensive downstream evaluations. We propose Multi-Objective GFlowNets (MOGFNs), a novel method for generating diverse Pareto optimal solutions, based on GFlowNets. We introduce two variants of MOGFNs: MOGFN-PC, which models a family of independent sub-problems defined by a scalarization function, with reward-conditional GFlowNets, and MOGFN-AL, which solves a sequence of sub-problems defined by an acquisition function in an active learning loop. Our experiments on wide variety of synthetic and benchmark tasks demonstrate advantages of the proposed methods in terms of the Pareto performance and importantly, improved candidate diversity, which is the main contribution of this work.

artificial intelligence, machine learning, objective, (17 more...)

arXiv.org Artificial Intelligence

2210.12765

Country:

North America > Canada > Quebec (0.14)
North America > United States > Hawaii (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback