AITopics | external action

Collaborating Authors

external action

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy

Yang, Zonghan, Li, Peng, Yan, Ming, Zhang, Ji, Huang, Fei, Liu, Yang

arXiv.org Artificial IntelligenceApr-1-2024

Language agents have demonstrated autonomous decision-making abilities by reasoning with foundation models. Recently, efforts have been made to train language agents for performance improvement, with multi-step reasoning and action trajectories as the training data. However, collecting such trajectories still requires considerable human effort, by either artificial annotation or implementations of diverse prompting frameworks. In this work, we propose A$^3$T, a framework that enables the Autonomous Annotation of Agent Trajectories in the style of ReAct. The central role is an ActRe prompting agent, which explains the reason for an arbitrary action. When randomly sampling an external action, the ReAct-style agent could query the ActRe agent with the action to obtain its textual rationales. Novel trajectories are then synthesized by prepending the posterior reasoning from ActRe to the sampled action. In this way, the ReAct-style agent executes multiple trajectories for the failed tasks, and selects the successful ones to supplement its failed trajectory for contrastive self-training. Realized by policy gradient methods with binarized rewards, the contrastive self-training with accumulated trajectories facilitates a closed loop for multiple rounds of language agent self-improvement. We conduct experiments using QLoRA fine-tuning with the open-sourced Mistral-7B-Instruct-v0.2. In AlfWorld, the agent trained with A$^3$T obtains a 1-shot success rate of 96%, and 100% success with 4 iterative rounds. In WebShop, the 1-shot performance of the A$^3$T agent matches human average, and 4 rounds of iterative refinement lead to the performance approaching human experts. A$^3$T agents significantly outperform existing techniques, including prompting with GPT-4, advanced agent frameworks, and fully fine-tuned LLMs.

agent, arxiv preprint arxiv, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2403.14589

Country:

Asia > China > Beijing > Beijing (0.04)
North America > Montserrat (0.04)
North America > Dominican Republic (0.04)

Genre: Research Report (0.64)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

Why Don't You Do Something About It? Outlining Connections between AI Explanations and User Actions

Mansi, Gennie, Riedl, Mark

arXiv.org Artificial IntelligenceMay-10-2023

A core assumption of explainable AI systems is that explanations change what users know, thereby enabling them to act within their complex socio-technical environments. Despite the centrality of action, explanations are often organized and evaluated based on technical aspects. Prior work varies widely in the connections it traces between information provided in explanations and resulting user actions. An important first step in centering action in evaluations is understanding what the XAI community collectively recognizes as the range of information that explanations can present and what actions are associated with them. In this paper, we present our framework, which maps prior work on information presented in explanations and user action, and we discuss the gaps we uncovered about the information presented to users.

information, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2305.06297

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Georgia > Fulton County > Atlanta (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.91)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.70)

Add feedback

Reinforcement Learning with Information-Theoretic Actuation

Catt, Elliot, Hutter, Marcus, Veness, Joel

arXiv.org Artificial IntelligenceSep-30-2021

Reinforcement Learning formalises an embodied agent's interaction with the environment through observations, rewards and actions. But where do the actions come from? Actions are often considered to represent something external, such as the movement of a limb, a chess piece, or more generally, the output of an actuator. In this work we explore and formalize a contrasting view, namely that actions are best thought of as the output of a sequence of internal choices with respect to an action model. This view is particularly well-suited for leveraging the recent advances in large sequence models as prior knowledge for multi-task reinforcement learning problems. Our main contribution in this work is to show how to augment the standard MDP formalism with a sequential notion of internal action using information-theoretic techniques, and that this leads to self-consistent definitions of both internal and external action value functions.

action model, action space, external action, (12 more...)

arXiv.org Artificial Intelligence

2109.15147

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Chess (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Bayesian Policy Search with Policy Priors

Wingate, David (Massachusetts Institute of Technology) | Goodman, Noah D. (Stanford University) | Roy, Daniel M. (Massachusetts Institute of Technology) | Kaelbling, Leslie P. (Massachusetts Institute of Technology) | Tenenbaum, Joshua B. (Massachusetts Institute of Technology)

AAAI ConferencesJul-19-2011

We consider the problem of learning to act in partially observable, continuous-state-and-action worlds where we have abstract prior knowledge about the structure of the optimal policy in the form of a distribution over policies. Using ideas from planning-as-inference reductions and Bayesian unsupervised learning, we cast Markov Chain Monte Carlo as a stochastic, hill-climbing policy search algorithm. Importantly, this algorithm's search bias is directly tied to the prior and its MCMC proposal kernels, which means we can draw on the full Bayesian toolbox to express the search bias, including nonparametric priors and structured, recursive processes like grammars over action sequences. Furthermore, we can reason about uncertainty in the search bias itself by constructing a hierarchical prior and reasoning about latent variables that determine the abstract structure of the policy. This yields an adaptive search algorithm---our algorithm learns to learn a structured policy efficiently. We show how inference over the latent variables in these policy priors enables intra- and intertask transfer of abstract knowledge. We demonstrate the flexibility of this approach by learning meta search biases, by constructing a nonparametric finite state controller to model memory, by discovering motor primitives using a simple grammar over primitive actions, and by combining all three.

algorithm, maze, motor primitive, (14 more...)

AAAI Conferences

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Add feedback

Modeling Interventions Using Belief Causal Networks

Boukhris, Imen (LARODEC - Universite de Tunis) | Elouedi, Zied (LARODEC - Universite de Tunis) | Benferhat, Salem (CRIL - Universite d'Artois)

AAAI ConferencesMay-18-2011

Causality plays an important role in our comprehension of the world. It amounts to determine what truly causes what and what it matters. Interventions allow the identification of elements in a sequence of events that are related in a causal way. In this paper, we introduce belief causation and we proposea method for handling interventions in graphical model under an uncertain environment where the uncertainty is represented by belief masses, so-called belief causal networks. More specifically, we propose a generalization of the “DO” operator and explain the needed changes on the structure of the graph to model a belief causal network on which interventions are proceeded.

causal network, extension, intervention, (14 more...)

AAAI Conferences

Twenty-Fourth International FLAIRS Conference

Country:

Africa > Middle East > Tunisia > Tunis Governorate > Tunis (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback