AITopics | ensembledagger

CubeDAgger: Improved Robustness of Interactive Imitation Learning without Violation of Dynamic Stability

Kobayashi, Taisuke

arXiv.org Artificial Intelligence, May-9-2025

Interactive imitation learning makes an agent's control policy robust through stepwise supervision from an expert. Recent algorithms mostly employ expert-agent switching systems, which reduce the expert's burden by limiting when supervision is given. However, selecting that timing precisely is difficult, and such switching causes abrupt changes in actions that damage dynamic stability. This paper therefore proposes a novel method, CubeDAgger, which improves robustness while reducing dynamic stability violations by making three improvements to a baseline method, EnsembleDAgger. The first adds a regularization to explicitly activate the threshold for deciding the supervision timing. The second transforms the expert-agent switching system into an optimal consensus system of multiple action candidates. The third introduces autoregressive colored noise on the actions to make stochastic exploration consistent over time. These improvements are verified in simulations, which show that the learned policies are sufficiently robust while maintaining dynamic stability during interaction.
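To illustrate what "autoregressive colored noise" can mean in practice, here is a minimal sketch of a first-order autoregressive, AR(1), noise process. This is an assumption made for illustration: the abstract does not specify the noise model CubeDAgger uses, and the names `ar1_noise`, `rho`, and `sigma` are hypothetical. The point is only that consecutive noise samples are correlated, so exploration drifts smoothly instead of jittering independently at every step.

```python
import math
import random


def ar1_noise(n_steps, rho=0.9, sigma=1.0, seed=0):
    """Return n_steps samples of AR(1) noise (illustrative, not the paper's model).

    Update rule: eps[t] = rho * eps[t-1] + sqrt(1 - rho^2) * w[t],
    with w[t] ~ N(0, sigma^2). The sqrt factor keeps the stationary
    standard deviation equal to sigma for any |rho| < 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    eps = rng.gauss(0.0, sigma)  # draw the first sample from the stationary law
    samples = []
    for _ in range(n_steps):
        samples.append(eps)
        eps = rho * eps + scale * rng.gauss(0.0, sigma)
    return samples


def lag1_autocorrelation(xs):
    """Sample correlation between each value and its successor."""
    mean = sum(xs) / len(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((a - mean) ** 2 for a in xs)
    return num / den
```

With `rho=0.0` the process reduces to ordinary white Gaussian noise. Values of `rho` close to 1 give slowly varying perturbations, which is the sense in which such noise is "consistent over time".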

ensembledagger, machine learning, reinforcement learning, (18 more...)

arXiv: 2505.04897

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback

Zhao, Michelle, Simmons, Reid, Admoni, Henny, Ramdas, Aaditya, Bajcsy, Andrea

arXiv.org Artificial Intelligence, Oct-11-2024

In interactive imitation learning (IL), uncertainty quantification offers a way for the learner (i.e. robot) to contend with distribution shifts encountered during deployment by actively seeking additional feedback from an expert (i.e. human) online. Prior works use mechanisms like ensemble disagreement or Monte Carlo dropout to quantify when black-box IL policies are uncertain; however, these approaches can lead to overconfident estimates when faced with deployment-time distribution shifts. Instead, we contend that we need uncertainty quantification algorithms that can leverage the expert human feedback received during deployment time to adapt the robot's uncertainty online. To tackle this, we draw upon online conformal prediction, a distribution-free method for constructing prediction intervals online given a stream of ground-truth labels. Human labels, however, are intermittent in the interactive IL setting. Thus, from the conformal prediction side, we introduce a novel uncertainty quantification algorithm called intermittent quantile tracking (IQT) that leverages a probabilistic model of intermittent labels, maintains asymptotic coverage guarantees, and empirically achieves desired coverage levels. From the interactive IL side, we develop ConformalDAgger, a new approach wherein the robot uses prediction intervals calibrated by IQT as a reliable measure of deployment-time uncertainty to actively query for more expert feedback. We compare ConformalDAgger to prior uncertainty-aware DAgger methods in scenarios where the distribution shift is (and isn't) present because of changes in the expert's policy. We find that in simulated and hardware deployments on a 7DOF robotic manipulator, ConformalDAgger detects high uncertainty when the expert shifts and increases the number of interventions compared to baselines, allowing the robot to more quickly learn the new behavior.
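The idea of tracking a quantile online from intermittent labels can be sketched as follows. This is an illustrative simplification, not the paper's IQT algorithm: it assumes a fixed, known observation probability `p_obs` and uses an inverse-probability-weighted version of the standard online quantile-tracking update. The names `track_quantile`, `eta`, and `demo` are hypothetical.

```python
import random


def track_quantile(scores, observed, p_obs, alpha=0.1, eta=0.01, q0=0.0):
    """Track a (1 - alpha) quantile of nonconformity scores online.

    scores[t]   : nonconformity score at step t (only usable if observed[t])
    observed[t] : True when the expert label arrived at step t
    p_obs       : assumed probability that a label is observed (hypothetical)
    Returns the threshold q used at each step, before that step's update.
    """
    if not 0.0 < p_obs <= 1.0:
        raise ValueError("p_obs must be in (0, 1]")
    q = q0
    thresholds = []
    for score, seen in zip(scores, observed):
        thresholds.append(q)
        if seen:
            miss = 1.0 if score > q else 0.0
            # Raise q after a miss, lower it slightly after a cover.
            # Dividing by p_obs compensates for the skipped steps.
            q += eta * (miss - alpha) / p_obs
    return thresholds


def demo(n_steps=20000, p_obs=0.5, alpha=0.1, seed=0):
    """Simulate uniform scores with randomly missing labels."""
    rng = random.Random(seed)
    scores = [rng.random() for _ in range(n_steps)]
    observed = [rng.random() < p_obs for _ in range(n_steps)]
    thresholds = track_quantile(scores, observed, p_obs, alpha=alpha)
    covered = sum(s <= q for s, q in zip(scores, thresholds))
    return thresholds[-1], covered / n_steps
```

Calling `demo()` returns the final threshold and the fraction of steps on which the score fell at or below the threshold. For uniform scores and `alpha=0.1`, the threshold should hover near 0.9 and the coverage near 90%. In an interactive setting, a large threshold would signal high uncertainty and could trigger a query to the expert.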

artificial intelligence, machine learning, prediction, (18 more...)

arXiv: 2410.08852

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.81)

Industry:

Transportation (0.48)
Media > Television (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Leveraging Demonstrator-perceived Precision for Safe Interactive Imitation Learning of Clearance-limited Tasks

Oh, Hanbit, Matsubara, Takamitsu

arXiv.org Artificial Intelligence, Feb-20-2024

Interactive imitation learning (IIL) is an efficient, model-free method in which a robot learns a task by repeatedly alternating between executing its learning policy and collecting data by querying human demonstrations. However, deploying immature policies for clearance-limited tasks, like industrial insertion, poses significant collision risks. For such tasks, a robot should detect the collision risks and request intervention by ceding control to a human when collisions are imminent. Detecting those risks requires an accurate model of the environment, a need that significantly limits the scope of IIL applications. In contrast, humans implicitly demonstrate environmental precision by adjusting their behavior to avoid collisions when performing tasks. Inspired by human behavior, this paper presents a novel interactive learning method, Demonstrator-perceived Precision-aware Interactive Imitation Learning (DPIIL), that uses demonstrator-perceived precision as a criterion for human intervention. DPIIL captures precision by observing the speed-accuracy trade-off exhibited in human demonstrations and cedes control to a human to avoid collisions in states where high precision is estimated. DPIIL improves the safety of interactive policy learning and ensures efficiency without explicitly providing precise information of the environment. We assessed DPIIL's effectiveness through simulations and real-robot experiments that trained a UR5e 6-DOF robotic arm to perform assembly tasks. Our results showed significantly improved training safety, and our best performance compared favorably with other learning methods.

intervention, precision, robot, (14 more...)

doi: 10.1109/LRA.2024.3366755

arXiv: 2402.13466

Country: Asia > Japan (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine (0.47)
Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
