AITopics | autonomous reinforcement learning

Autonomous Reinforcement Learning via Subgoal Curricula

Neural Information Processing SystemsDec-24-2025, 13:46:32 GMT

Reinforcement learning (RL) promises to enable autonomous acquisition of complex behaviors for diverse agents. However, the success of current reinforcement learning algorithms is predicated on an often under-emphasised requirement -- each trial needs to start from a fixed initial state distribution. Unfortunately, resetting the environment to its initial state after each trial requires substantial amount of human supervision and extensive instrumentation of the environment which defeats the goal of autonomous acquisition of complex behaviors. In this work, we propose Value-accelerated Persistent Reinforcement Learning (VaPRL), which generates a curriculum of initial states such that the agent can bootstrap on the success of easier tasks to efficiently learn harder tasks. The agent also learns to reach the initial states proposed by the curriculum, minimizing the reliance on human interventions into the learning. We observe that VaPRL reduces the interventions required by three orders of magnitude compared to episodic RL while outperforming prior state-of-the art methods for reset-free RL both in terms of sample efficiency and asymptotic performance on a variety of simulated robotics problems.

autonomous reinforcement learning, curriculum, name change, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning

Neural Information Processing SystemsDec-24-2025, 09:57:25 GMT

A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world. A critical challenge to such autonomy is the presence of irreversible states which require external assistance to recover from, such as when a robot arm has pushed an object off of a table. While standard agents require constant monitoring to decide when to intervene, we aim to design proactive agents that can request human intervention only when needed. To this end, we propose an algorithm that efficiently learns to detect and avoid states that are irreversible, and proactively asks for help in case the agent does enter them. On a suite of continuous control environments with unknown irreversible states, we find that our algorithm exhibits better sample-and intervention-efficiency compared to existing methods.

autonomous reinforcement learning, name change, proactive intervention, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Robots (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

Add feedback

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Neural Information Processing SystemsMay-26-2025, 17:13:37 GMT

Pre-trained vision language models (VLMs), though powerful, typically lack training on decision-centric data, rendering them sub-optimal for decision-making tasks such as in-the-wild device control through Graphical User Interfaces (GUIs) when used off-the-shelf. While training with static demonstrations has shown some promise, we show that such methods fall short when controlling real GUIs due to their failure to deal with real world stochasticity and dynamism not captured in static observational data. This paper introduces a novel autonomous RL approach, called DigiRL, for training in-the-wild device control agents through fine-tuning a pre-trained VLM in two stages: offline and offline-to-online RL. We first build a scalable and parallelizable Android learning environment equipped with a VLM-based general-purpose evaluator and then identify the key design choices for simple and effective RL in this domain. We demonstrate the effectiveness of DigiRL using the Android-in-the-Wild (AitW) dataset, where our 1.5B

artificial intelligence, human computer interaction, machine learning, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Human Computer Interaction > Interfaces (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Add feedback

Autonomous Reinforcement Learning via Subgoal Curricula

Neural Information Processing SystemsJan-18-2025, 00:23:29 GMT

Reinforcement learning (RL) promises to enable autonomous acquisition of complex behaviors for diverse agents. However, the success of current reinforcement learning algorithms is predicated on an often under-emphasised requirement -- each trial needs to start from a fixed initial state distribution. Unfortunately, resetting the environment to its initial state after each trial requires substantial amount of human supervision and extensive instrumentation of the environment which defeats the goal of autonomous acquisition of complex behaviors. In this work, we propose Value-accelerated Persistent Reinforcement Learning (VaPRL), which generates a curriculum of initial states such that the agent can bootstrap on the success of easier tasks to efficiently learn harder tasks. The agent also learns to reach the initial states proposed by the curriculum, minimizing the reliance on human interventions into the learning. We observe that VaPRL reduces the interventions required by three orders of magnitude compared to episodic RL while outperforming prior state-of-the art methods for reset-free RL both in terms of sample efficiency and asymptotic performance on a variety of simulated robotics problems.

autonomous reinforcement learning, curriculum, reinforcement learning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning

Neural Information Processing SystemsOct-11-2024, 11:51:41 GMT

A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world. A critical challenge to such autonomy is the presence of irreversible states which require external assistance to recover from, such as when a robot arm has pushed an object off of a table. While standard agents require constant monitoring to decide when to intervene, we aim to design proactive agents that can request human intervention only when needed. To this end, we propose an algorithm that efficiently learns to detect and avoid states that are irreversible, and proactively asks for help in case the agent does enter them. On a suite of continuous control environments with unknown irreversible states, we find that our algorithm exhibits better sample- and intervention-efficiency compared to existing methods.

autonomous reinforcement learning, irreversible state, proactive intervention, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)
Information Technology > Artificial Intelligence > Robots (0.66)

Add feedback

Collaborating Authors

autonomous reinforcement learning

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Autonomous Reinforcement Learning via Subgoal Curricula

When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Autonomous Reinforcement Learning via Subgoal Curricula

When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning