 human supervisor


Evaluating Control Protocols for Untrusted AI Agents

arXiv.org Artificial Intelligence

As AI systems become more capable and widely deployed as agents, ensuring their safe operation becomes critical. AI control offers one approach to mitigating the risk from untrusted AI agents by monitoring their actions and intervening or auditing when necessary. Evaluating the safety of these protocols requires understanding both their effectiveness against current attacks and their robustness to adaptive adversaries. In this work, we systematically evaluate a range of control protocols in SHADE-Arena, a dataset of diverse agentic environments. First, we evaluate blue team protocols, including deferral to trusted models, resampling, and deferring on critical actions, against a default attack policy. We find that resampling for incrimination and deferring on critical actions perform best, increasing safety from 50% to 96%. We then iterate on red team strategies against these protocols and find that attack policies with additional affordances, such as knowledge of when resampling occurs or the ability to simulate monitors, can substantially improve attack success rates against our resampling strategy, decreasing safety to 17%. However, deferring on critical actions is highly robust to even our strongest red team strategies, demonstrating the importance of denying attack policies access to protocol internals.
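As a rough illustration of how the blue team ideas named in this abstract (monitoring, resampling, deferral to a trusted model, and deferring on critical actions) could fit together in a single step, here is a minimal sketch. It is not the paper's implementation: the function names, the thresholds `defer_at` and `audit_at`, and the way the steps are combined are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Decision:
    action: str   # the action that is actually executed
    source: str   # "untrusted" or "trusted"
    audit: bool   # whether the step is flagged for human audit


def control_step(
    propose: Callable[[], str],          # hypothetical: samples one action from the untrusted agent
    trusted: Callable[[], str],          # hypothetical: action from a weaker, trusted model
    suspicion: Callable[[str], float],   # hypothetical monitor score in [0, 1]
    is_critical: Callable[[str], bool],  # hypothetical: marks high-stakes actions
    defer_at: float = 0.5,               # assumed threshold, not from the paper
    audit_at: float = 0.9,               # assumed threshold, not from the paper
    n_samples: int = 3,
) -> Decision:
    # Resample: draw several proposals and score each with the monitor.
    samples: List[str] = [propose() for _ in range(n_samples)]
    scores = [suspicion(a) for a in samples]
    # Any highly suspicious sample flags the step for audit.
    audit = max(scores) >= audit_at
    # Candidate for execution: the least suspicious sample.
    best_score, best = min(zip(scores, samples))
    # Defer to the trusted model on critical or still-suspicious actions.
    if is_critical(best) or best_score >= defer_at:
        return Decision(trusted(), "trusted", audit)
    return Decision(best, "untrusted", audit)
```

In this sketch the untrusted agent never sees the monitor scores or learns whether it was resampled, which loosely mirrors the abstract's point about denying attack policies access to protocol internals.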


Hierarchical LLMs In-the-loop Optimization for Real-time Multi-Robot Target Tracking under Unknown Hazards

arXiv.org Artificial Intelligence

In this paper, we propose a hierarchical Large Language Models (LLMs) in-the-loop optimization framework for real-time multi-robot task allocation and target tracking in an unknown hazardous environment subject to sensing and communication attacks. We formulate multi-robot coordination for tracking tasks as a bi-level optimization problem, with LLMs to reason about potential hazards in the environment and the status of the robot team and modify both the inner and outer levels of the optimization. The inner LLM adjusts parameters to prioritize various objectives, including performance, safety, and energy efficiency, while the outer LLM handles online variable completion for team reconfiguration. This hierarchical approach enables real-time adjustments to the robots' behavior. Additionally, a human supervisor can offer broad guidance and assessments to address unexpected dangers, model mismatches, and performance issues arising from local minima. We validate our proposed framework in both simulation and real-world experiments with comprehensive evaluations, which provide the potential for safe LLM integration for multi-robot problems.


Conversational Language Models for Human-in-the-Loop Multi-Robot Coordination

arXiv.org Artificial Intelligence

With the increasing prevalence and diversity of robots interacting in the real world, there is a need for flexible, on-the-fly planning and cooperation. Large Language Models are starting to be explored in a multimodal setup for communication, coordination, and planning in robotics. Existing approaches generally use a single agent building a plan, or have multiple homogeneous agents coordinating for a simple task. We present a decentralised, dialogical approach in which a team of agents with different abilities plans solutions through peer-to-peer and human-robot discussion. We suggest that argument-style dialogues are an effective way to facilitate adaptive use of each agent's abilities within a cooperative team. Two robots discuss how to solve a cleaning problem set by a human, define roles, and agree on the paths they each take. Each step can be interrupted by a human advisor, and agents check their plans with the human. Agents then execute this plan in the real world, collecting rubbish from people in each room. Our implementation uses text at every step, maintaining transparency and effective human-multi-robot interaction.


Human-guided Swarms: Impedance Control-inspired Influence in Virtual Reality Environments

arXiv.org Artificial Intelligence

As the potential for societal integration of multi-agent robotic systems increases [1], the need to manage the collective behaviors of such systems also increases [2, 3, 4]. There has been significant research effort directed towards the examination of how humans can assist in controlling such collective behaviors, such as in human-swarm interactions [5, 6, 7]. Agent-agent interactions in a swarm of small unmanned aerial systems (sUAS) lead to the emergence of collective behaviors that enable effective coverage and exploration across large spatial extents. However, the same inherent collective behaviors can occasionally limit the ability of the sUAS swarm to focus on specific objects of interest during coverage or exploration missions [8]. In these scenarios, the human operator or supervisor should have the opportunity to fractionally revoke or limit emergent swarm behaviors, and guide the swarm to achieve mission objectives. For most applications, including in industry- and defense-related contexts, such human-swarm interaction (HSI) will likely require intuitive and predictable mechanisms of control to quickly translate the input of the human (such as a gesture) to an influence or effect on the sUAS swarm. The goal of our work is to create an intuitive interface for a human supervisor to influence or guide an sUAS swarm without excessive incursions on decentralized control afforded by these systems, while attempting to create more predictable behaviors. This is a potentially valuable approach that can enable the full utilization of swarm capabilities, while also retaining an ongoing macroscopic level of swarm control in scenarios where focus on specific regions of interest is required (e.g., search and rescue, surveillance operations) [9]. The influence mechanism has been implemented and tested using 16 drones in a photo-realistic virtual reality (VR) environment (as shown in Figure 1).


IIFL: Implicit Interactive Fleet Learning from Heterogeneous Human Supervisors

arXiv.org Artificial Intelligence

Imitation learning has been applied to a range of robotic tasks, but can struggle when robots encounter edge cases that are not represented in the training data (i.e., distribution shift). Interactive fleet learning (IFL) mitigates distribution shift by allowing robots to access remote human supervisors during task execution and learn from them over time, but different supervisors may demonstrate the task in different ways. Recent work proposes Implicit Behavior Cloning (IBC), which is able to represent multimodal demonstrations using energy-based models (EBMs). In this work, we propose Implicit Interactive Fleet Learning (IIFL), an algorithm that builds on IBC for interactive imitation learning from multiple heterogeneous human supervisors. A key insight in IIFL is a novel approach for uncertainty quantification in EBMs using Jeffreys divergence. While IIFL is more computationally expensive than explicit methods, results suggest that IIFL achieves a 2.8x higher success rate in simulation experiments and a 4.5x higher return on human effort in a physical block pushing task over (Explicit) IFL, IBC, and other baselines.
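For readers unfamiliar with the Jeffreys divergence mentioned above: it is the symmetrized Kullback-Leibler divergence, J(p, q) = KL(p || q) + KL(q || p). The sketch below computes it for two discrete distributions only. How IIFL applies it to energy-based models is not described in this abstract, so the discrete setting, the function names, and the assumption of strictly positive probabilities are illustrative choices, not the authors' method.

```python
import math
from typing import Sequence


def kl(p: Sequence[float], q: Sequence[float]) -> float:
    # KL(p || q) for discrete distributions.
    # Assumption: q is strictly positive wherever p is positive.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jeffreys(p: Sequence[float], q: Sequence[float]) -> float:
    # Symmetrized KL divergence: zero when p == q, larger when they disagree.
    return kl(p, q) + kl(q, p)


# Two hypothetical action distributions that disagree noticeably.
print(jeffreys([0.5, 0.5], [0.9, 0.1]))
```

Because the measure is symmetric, it does not matter which distribution is treated as the reference, which is convenient when comparing models that have equal standing.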


Heterogeneous robot teams with unified perception and autonomy: How Team CSIRO Data61 tied for the top score at the DARPA Subterranean Challenge

arXiv.org Artificial Intelligence

The DARPA Subterranean Challenge was designed for competitors to develop and deploy teams of autonomous robots to explore difficult unknown underground environments. The environments were categorised into human-made tunnels, underground urban infrastructure and natural caves, and each of these subdomains had many challenging elements for robot perception, locomotion, navigation and autonomy. These included degraded wireless communication, poor visibility due to smoke, narrow passages and doorways, clutter, uneven ground, slippery and loose terrain, stairs, ledges, overhangs, dripping water, and dynamic obstacles that move to block paths, among others. In the Final Event of this challenge, held in September 2021, the course consisted of all three subdomains. The task was for the robot team to perform a scavenger hunt for a number of pre-defined artefacts within a limited time frame. Only one human supervisor was allowed to communicate with the robots once they were in the course. Points were scored when accurate detections and their locations were communicated back to the scoring server. A total of 8 teams competed in the finals held at the Mega Cavern in Louisville, KY, USA. This article describes the systems deployed by Team CSIRO Data61 that tied for the top score and won second place at the event.


Fleet-DAgger: Interactive Robot Fleet Learning with Scalable Human Supervision

arXiv.org Artificial Intelligence

Amazon, Nimble, Plus One, Waymo, and Zoox use remote human supervision of robot fleets in applications ranging from self-driving taxis to automated warehouse fulfillment [1, 2, 3, 4, 5]. These robots intermittently cede control during task execution to remote human supervisors for corrective interventions. The interventions take place either during learning, when they are used to improve the robot policy, or during execution, when the policy is no longer updated but robots can still request human assistance when needed to improve reliability. In the continual learning setting, these occur simultaneously: the robot policy has been deployed but continues to be updated indefinitely with additional intervention data. Furthermore, any individual robot can share its intervention data with the rest of the fleet. As opposed to robot swarms that must coordinate with each other to achieve a common objective, a robot fleet is a set of independent robots simultaneously executing the same control policy in parallel environments. We refer to the setting of a robot fleet learning via interactive requests for human supervision (see Figure 1) as Interactive Fleet Learning (IFL). Of central importance in IFL is the supervisor allocation problem: how should limited human supervision be allocated to robots in a manner that maximizes the throughput of the fleet?
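The supervisor allocation problem stated above can be illustrated with a simple priority-based rule: score each robot by how urgently it needs help, then assign the limited pool of humans to the highest-scoring robots. This is only a sketch under assumptions. The function name, the priority values, and the tie-breaking rule are hypothetical, and the excerpt does not specify how priorities should be computed.

```python
from typing import Dict, List


def allocate_supervisors(priorities: Dict[str, float], n_humans: int) -> List[str]:
    # Rank robots by descending priority; break ties by robot name for determinism.
    ranked = sorted(priorities, key=lambda robot: (-priorities[robot], robot))
    # Assign at most n_humans supervisors and skip robots that need no help.
    return [robot for robot in ranked[:n_humans] if priorities[robot] > 0]


# Hypothetical priorities, e.g. an estimate of how likely each robot is to fail.
fleet = {"r1": 0.2, "r2": 0.9, "r3": 0.0, "r4": 0.5}
print(allocate_supervisors(fleet, n_humans=2))  # ['r2', 'r4']
```

A greedy top-k rule like this is cheap to recompute at every timestep, although it ignores factors such as how long a supervisor is already committed to a robot.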


Truly autonomous cars may be impossible without helpful human touch

#artificialintelligence

An operator controls a Fetch driverless car from the office of Imperium Drive, during driverless car trials, in Milton Keynes, Britain, June 8, 2022. MILTON KEYNES, England (Reuters) - Autonomous vehicle (AV) startups have raised tens of billions of dollars based on promises to develop truly self-driving cars, but industry executives and experts say remote human supervisors may be needed permanently to help robot drivers in trouble. The central premise of autonomous vehicles – that computers and artificial intelligence will dramatically reduce accidents caused by human error – has driven much of the research and investment. But there is a catch: Making robot cars that can drive more safely than people is immensely tough because self-driving software systems simply lack humans' ability to predict and assess risk quickly, especially when encountering unexpected incidents or "edge cases." "Well, my question would be, 'Why?'" said Kyle Vogt, CEO of Cruise, a unit of General Motors (NYSE:GM), when asked if he could see a point where remote human overseers should be removed from operations.


NASA Sending Two Extra Helicopters to Mars - Channel969

#artificialintelligence

With direct funding plus prize money that reached into the hundreds of thousands, DARPA inspired international collaborations among top academic institutions as well as industry. A series of three preliminary circuit events would give teams experience with each environment. During the Tunnel Circuit event, which took place in August 2019 in the National Institute for Occupational Safety and Health's experimental coal mine, on the outskirts of Pittsburgh, many teams lost communication with their robots after the first bend in the tunnel. Six months later, at the Urban Circuit event, held at an unfinished nuclear power station in Satsop, Wash., teams beefed up their communications with everything from a simple tethered Ethernet cable to battery-powered mesh network nodes that robots would drop like breadcrumbs as they went along, ideally just before they passed out of communication range. The Cave Circuit, scheduled for the fall of 2020, was canceled due to COVID-19. By the time teams reached the SubT Final Event in the Louisville Mega Cavern, the focus was on autonomy rather than communications.


Your AI strategy's secret ingredient

#artificialintelligence

AI is increasingly becoming a business imperative. Nine in 10 Fortune 1000 companies are not only investing in AI, but are increasing those investments, with 92% reporting measurable business benefits from their current AI use -- up from 72% in 2020 and just 28% in 2018, according to a 2022 NewVantage Partners executive survey. Still, only 26% of companies say their AI initiatives have actually moved into widespread production. The obstacle is cultural barriers: executives are 11 times more likely to say culture is the greatest impediment to AI success than to cite technology limitations as the biggest barrier. And the cultural challenges have actually gotten worse, with 92% of executives citing cultural factors this year vs. 81% in 2018.