AITopics | Dasagi, Vibhavari

Collaborating Authors

Dasagi, Vibhavari

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

States as Strings as Strategies: Steering Language Models with Game-Theoretic Solvers

Gemp, Ian, Bachrach, Yoram, Lanctot, Marc, Patel, Roma, Dasagi, Vibhavari, Marris, Luke, Piliouras, Georgios, Liu, Siqi, Tuyls, Karl

arXiv.org Artificial IntelligenceFeb-6-2024

Game theory is the study of mathematical models of strategic interactions among rational agents. Language is a key medium of interaction for humans, though it has historically proven difficult to model dialogue and its strategic motivations mathematically. A suitable model of the players, strategies, and payoffs associated with linguistic interactions (i.e., a binding to the conventional symbolic logic of game theory) would enable existing game-theoretic algorithms to provide strategic solutions in the space of language. In other words, a binding could provide a route to computing stable, rational conversational strategies in dialogue. Large language models (LLMs) have arguably reached a point where their generative capabilities can enable realistic, human-like simulations of natural dialogue. By prompting them in various ways, we can steer their responses towards different output utterances. Leveraging the expressivity of natural language, LLMs can also help us quickly generate new dialogue scenarios, which are grounded in real world applications. In this work, we present one possible binding from dialogue to game theory as well as generalizations of existing equilibrium finding algorithms to this setting. In addition, by exploiting LLMs generation capabilities along with our proposed binding, we can synthesize a large repository of formally-defined games in which one can study and test game-theoretic solution concepts. We also demonstrate how one can combine LLM-driven game generation, game-theoretic solvers, and imitation learning to construct a process for improving the strategic capabilities of LLMs.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2402.01704

Country:

North America > United States (0.14)
Asia (0.14)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Human-Timescale Adaptation in an Open-Ended Task Space

Adaptive Agent Team, null, Bauer, Jakob, Baumli, Kate, Baveja, Satinder, Behbahani, Feryal, Bhoopchand, Avishkar, Bradley-Schmieg, Nathalie, Chang, Michael, Clay, Natalie, Collister, Adrian, Dasagi, Vibhavari, Gonzalez, Lucy, Gregor, Karol, Hughes, Edward, Kashem, Sheleem, Loks-Thompson, Maria, Openshaw, Hannah, Parker-Holder, Jack, Pathak, Shreya, Perez-Nieves, Nicolas, Rakicevic, Nemanja, Rocktäschel, Tim, Schroecker, Yannick, Sygnowski, Jakub, Tuyls, Karl, York, Sarah, Zacherl, Alexander, Zhang, Lei

arXiv.org Artificial IntelligenceJan-18-2023

Foundation models have shown impressive adaptation and scalability in supervised and self-supervised learning problems, but so far these successes have not fully translated to reinforcement learning (RL). In this work, we demonstrate that training an RL agent at scale leads to a general in-context learning algorithm that can adapt to open-ended novel embodied 3D problems as quickly as humans. In a vast space of held-out environment dynamics, our adaptive agent (AdA) displays on-the-fly hypothesis-driven exploration, efficient exploitation of acquired knowledge, and can successfully be prompted with first-person demonstrations. Adaptation emerges from three ingredients: (1) meta-reinforcement learning across a vast, smooth and diverse task distribution, (2) a policy parameterised as a large-scale attention-based memory architecture, and (3) an effective automated curriculum that prioritises tasks at the frontier of an agent's capabilities. We demonstrate characteristic scaling laws with respect to network size, memory length, and richness of the training task distribution. We believe our results lay the foundation for increasingly general and adaptive RL agents that perform well across ever-larger open-ended domains.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

arXiv.org Artificial Intelligence

2301.07608

Country:

North America > United States (1.00)
Europe (0.92)

Genre: Research Report > New Finding (0.87)

Industry:

Education (1.00)
Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
(2 more...)

Add feedback

Zero-Shot Uncertainty-Aware Deployment of Simulation Trained Policies on Real-World Robots

Rana, Krishan, Dasagi, Vibhavari, Haviland, Jesse, Talbot, Ben, Milford, MIchael, Sünderhauf, Niko

arXiv.org Artificial IntelligenceDec-9-2021

While deep reinforcement learning (RL) agents have demonstrated incredible potential in attaining dexterous behaviours for robotics, they tend to make errors when deployed in the real world due to mismatches between the training and execution environments. In contrast, the classical robotics community have developed a range of controllers that can safely operate across most states in the real world given their explicit derivation. These controllers however lack the dexterity required for complex tasks given limitations in analytical modelling and approximations. In this paper, we propose Bayesian Controller Fusion (BCF), a novel uncertainty-aware deployment strategy that combines the strengths of deep RL policies and traditional handcrafted controllers. In this framework, we can perform zero-shot sim-to-real transfer, where our uncertainty based formulation allows the robot to reliably act within out-of-distribution states by leveraging the handcrafted controller while gaining the dexterity of the learned system otherwise. We show promising results on two real-world continuous control tasks, where BCF outperforms both the standalone policy and controller, surpassing what either can achieve independently.

controller, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2112.05299

Genre: Research Report (0.40)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)

Add feedback

Bayesian Controller Fusion: Leveraging Control Priors in Deep Reinforcement Learning for Robotics

Rana, Krishan, Dasagi, Vibhavari, Haviland, Jesse, Talbot, Ben, Milford, Michael, Sünderhauf, Niko

arXiv.org Artificial IntelligenceJul-22-2021

We present Bayesian Controller Fusion (BCF): a hybrid control strategy that combines the strengths of traditional hand-crafted controllers and model-free deep reinforcement learning (RL). BCF thrives in the robotics domain, where reliable but suboptimal control priors exist for many tasks, but RL from scratch remains unsafe and data-inefficient. By fusing uncertainty-aware distributional outputs from each system, BCF arbitrates control between them, exploiting their respective strengths. We study BCF on two real-world robotics tasks involving navigation in a vast and long-horizon environment, and a complex reaching task that involves manipulability maximisation. For both these domains, there exist simple handcrafted controllers that can solve the task at hand in a risk-averse manner but do not necessarily exhibit the optimal solution given limitations in analytical modelling, controller miscalibration and task variation. As exploration is naturally guided by the prior in the early stages of training, BCF accelerates learning, while substantially improving beyond the performance of the control prior, as the policy gains more experience. More importantly, given the risk-aversity of the control prior, BCF ensures safe exploration and deployment, where the control prior naturally dominates the action distribution in states unknown to the policy. We additionally show BCF's applicability to the zero-shot sim-to-real setting and its ability to deal with out-of-distribution states in the real-world. BCF is a promising approach for combining the complementary strengths of deep RL and traditional robotic control, surpassing what either can achieve independently. The code and supplementary video material are made publicly available at https://krishanrana.github.io/bcf.

controller, survey article, upstream oil & gas, (18 more...)

arXiv.org Artificial Intelligence

2107.09822

Country: Oceania > Australia > Queensland (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Ctrl-Z: Recovering from Instability in Reinforcement Learning

Dasagi, Vibhavari, Bruce, Jake, Peynot, Thierry, Leitner, Jürgen

arXiv.org Machine LearningOct-8-2019

When learning behavior, training data is often generated by the learner itself; this can result in unstable training dynamics, and this problem has particularly important applications in safety-sensitive real-world control tasks such as robotics. In this work, we propose a principled and model-agnostic approach to mitigate the issue of unstable learning dynamics by maintaining a history of a reinforcement learning agent over the course of training, and reverting to the parameters of a previous agent whenever performance significantly decreases. We develop techniques for evaluating this performance through statistical hypothesis testing of continued improvement, and evaluate them on a standard suite of challenging benchmark tasks involving continuous control of simulated robots. We show improvements over state-of-the-art reinforcement learning algorithms in performance and robustness to hyperparameters, outperforming DDPG in 5 out of 6 evaluation environments and showing no decrease in performance with TD3, which is known to be relatively stable. In this way, our approach takes an important step towards increasing data efficiency and stability in training for real-world robotic applications.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

1910.03732

Country: Oceania > Australia > Queensland (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Sim-to-Real Transfer of Robot Learning with Variable Length Inputs

Dasagi, Vibhavari, Lee, Robert, Mou, Serena, Bruce, Jake, Sünderhauf, Niko, Leitner, Jürgen

arXiv.org Machine LearningOct-8-2019

Current end-to-end deep Reinforcement Learning (RL) approaches require jointly learning perception, decision-making and low-level control from very sparse reward signals and high-dimensional inputs, with little capability of incorporating prior knowledge. This results in prohibitively long training times for use on real-world robotic tasks. Existing algorithms capable of extracting task-level representations from high-dimensional inputs, e.g. object detection, often produce outputs of varying lengths, restricting their use in RL methods due to the need for neural networks to have fixed length inputs. In this work, we propose a framework that combines deep sets encoding, which allows for variable-length abstract representations, with modular RL that utilizes these representations, decoupling high-level decision making from low-level control. We successfully demonstrate our approach on the robot manipulation task of object sorting, showing that this method can learn effective policies within mere minutes of highly simplified simulation. The learned policies can be directly deployed on a robot without further training, and generalize to variations of the task unseen during training.

artificial intelligence, neural network, reinforcement learning, (18 more...)

arXiv.org Machine Learning

1809.0748

Country: Oceania > Australia > Queensland (0.14)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback