 Geothermal System for Power Generation


AIVIO: Closed-loop, Object-relative Navigation of UAVs with AI-aided Visual Inertial Odometry

arXiv.org Artificial Intelligence

Object-relative mobile robot navigation is essential for a variety of tasks, e.g. autonomous critical infrastructure inspection, but requires the capability to extract semantic information about the objects of interest from raw sensory data. While deep learning-based (DL) methods excel at inferring semantic object information from images, such as class and relative 6 degree of freedom (6-DoF) pose, they are computationally demanding and thus often not suitable for payload constrained mobile robots. In this letter we present a real-time capable unmanned aerial vehicle (UAV) system for object-relative, closed-loop navigation with a minimal sensor configuration consisting of an inertial measurement unit (IMU) and RGB camera. Utilizing a DL-based object pose estimator, solely trained on synthetic data and optimized for companion board deployment, the object-relative pose measurements are fused with the IMU data to perform object-relative localization. We conduct multiple real-world experiments to validate the performance of our system for the challenging use case of power pole inspection. An example closed-loop flight is presented in the supplementary video.


An uncertainty-aware Digital Shadow for underground multimodal CO2 storage monitoring

arXiv.org Artificial Intelligence

Geological Carbon Storage (GCS) is arguably the only scalable net-negative CO2 emission technology available. While promising, subsurface complexities and heterogeneity of reservoir properties demand a systematic approach to quantify uncertainty when optimizing production and mitigating storage risks, which include assurances of Containment and Conformance of injected supercritical CO2. As a first step towards the design and implementation of a Digital Twin for monitoring underground storage operations, a machine learning based data-assimilation framework is introduced and validated on carefully designed realistic numerical simulations. As our implementation is based on Bayesian inference but does not yet support control and decision-making, we coin our approach an uncertainty-aware Digital Shadow. To characterize the posterior distribution for the state of CO2 plumes conditioned on multi-modal time-lapse data, the envisioned Shadow combines techniques from Simulation-Based Inference (SBI) and Ensemble Bayesian Filtering to establish probabilistic baselines and assimilate multi-modal data for GCS problems that are challenged by large degrees of freedom, nonlinear multi-physics, non-Gaussianity, and computationally expensive to evaluate fluid flow and seismic simulations. To enable SBI for dynamic systems, a recursive scheme is proposed where the Digital Shadow's neural networks are trained on simulated ensembles for their state and observed data (well and/or seismic). Once training is completed, the system's state is inferred when time-lapse field data becomes available. In this computational study, we observe that a lack of knowledge on the permeability field can be factored into the Digital Shadow's uncertainty quantification. To our knowledge, this work represents the first proof of concept of an uncertainty-aware, in-principle scalable Digital Shadow.


MHRC: Closed-loop Decentralized Multi-Heterogeneous Robot Collaboration with Large Language Models

arXiv.org Artificial Intelligence

The integration of large language models (LLMs) with robotics has significantly advanced robots' abilities in perception, cognition, and task planning. The use of natural language interfaces offers a unified approach for expressing the capability differences of heterogeneous robots, facilitating communication between them, and enabling seamless task allocation and collaboration. Currently, the utilization of LLMs to achieve decentralized multi-heterogeneous robot collaborative tasks remains an under-explored area of research. In this paper, we introduce a novel framework that utilizes LLMs to achieve decentralized collaboration among multiple heterogeneous robots. Our framework supports three robot categories, mobile robots, manipulation robots, and mobile manipulation robots, working together to complete tasks such as exploration, transportation, and organization. We developed a rich set of textual feedback mechanisms and chain-of-thought (CoT) prompts to enhance task planning efficiency and overall system performance. The mobile manipulation robot can adjust its base position flexibly, ensuring optimal conditions for grasping tasks. The manipulation robot can comprehend task requirements, seek assistance when necessary, and handle objects appropriately. Meanwhile, the mobile robot can explore the environment extensively, map object locations, and communicate this information to the mobile manipulation robot, thus improving task execution efficiency. We evaluated the framework using PyBullet, creating scenarios with three different room layouts and three distinct operational tasks. We tested various LLM models and conducted ablation studies to assess the contributions of different modules. The experimental results confirm the effectiveness and necessity of our proposed framework.


Closed-loop shape control of deformable linear objects based on Cosserat model

arXiv.org Artificial Intelligence

The robotic shape control of deformable linear objects has garnered increasing interest within the robotics community. Despite recent progress, the majority of shape control approaches can be classified into two main groups: open-loop control, which relies on physically realistic models to represent the object, and closed-loop control, which employs less precise models alongside visual data to compute commands. In this work, we present a novel 3D shape control approach that incorporates the physically realistic Cosserat model into a closed-loop control framework, using vision feedback to rectify errors in real time. This approach capitalizes on the advantages of both groups: the realism and precision provided by physics-based models, and the rapid computation (which enables real-time correction of model errors) and robustness to elastic parameter estimation inherent in vision-based approaches. This is achieved by computing a deformation Jacobian derived from both the Cosserat model and visual data. To demonstrate the effectiveness of the method, we conduct a series of shape control experiments where robots are tasked with deforming linear objects towards a desired shape.
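The idea of driving a shape error to zero through a deformation Jacobian can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the Jacobian is a fixed hypothetical matrix rather than one derived from a Cosserat model, the "vision measurement" is a linear stand-in, and the update is a plain Jacobian-transpose rule, which is not necessarily the control law used in the paper.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m: Matrix) -> Matrix:
    """Return the transpose of matrix m."""
    return [list(col) for col in zip(*m)]


def control_step(jacobian: Matrix, shape: Vector, target: Vector, gain: float) -> Vector:
    """One closed-loop update: command increment = gain * J^T * (target - shape)."""
    error = [t - s for t, s in zip(target, shape)]
    return [gain * d for d in mat_vec(transpose(jacobian), error)]


def run_closed_loop(jacobian: Matrix, target: Vector, gain: float = 0.2, steps: int = 50) -> Vector:
    """Iterate measure -> compare -> correct, and return the final shape."""
    command = [0.0] * len(jacobian[0])
    for _ in range(steps):
        # Hypothetical stand-in for the vision measurement: shape = J * command.
        shape = mat_vec(jacobian, command)
        delta = control_step(jacobian, shape, target, gain)
        command = [c + d for c, d in zip(command, delta)]
    return mat_vec(jacobian, command)


# Hypothetical 2x2 deformation Jacobian and desired shape features.
J = [[1.0, 0.0], [0.0, 2.0]]
target_shape = [1.0, -2.0]
final_shape = run_closed_loop(J, target_shape)
```

Because the shape is re-measured at every iteration, the loop keeps correcting even if the Jacobian is only approximately right, provided the gain is small enough for the iteration to converge.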


Safe and Stable Closed-Loop Learning for Neural-Network-Supported Model Predictive Control

arXiv.org Artificial Intelligence

Safe learning of control policies remains challenging, both in optimal control and reinforcement learning. In this article, we consider safe learning of parametrized predictive controllers that operate with incomplete information about the underlying process. To this end, we employ Bayesian optimization for learning the best parameters from closed-loop data. Our method focuses on the system's overall long-term closed-loop performance while keeping it safe and stable. Specifically, we parametrize the stage cost function of a model predictive controller (MPC) using a feedforward neural network. This allows for a high degree of flexibility, enabling the system to achieve better closed-loop performance with respect to a superordinate measure. However, this flexibility also necessitates safety measures, especially with respect to closed-loop stability. To this end, we explicitly incorporate stability information in the Bayesian-optimization-based learning procedure, thereby achieving rigorous probabilistic safety guarantees. The proposed approach is illustrated using a numerical example.


SEAL: Towards Safe Autonomous Driving via Skill-Enabled Adversary Learning for Closed-Loop Scenario Generation

arXiv.org Artificial Intelligence

Verification and validation of autonomous driving (AD) systems and components are of increasing importance as such technology becomes more prevalent in the real world. Safety-critical scenario generation is a key approach to robustify AD policies through closed-loop training. However, existing approaches for scenario generation rely on simplistic objectives, resulting in overly aggressive or non-reactive adversarial behaviors. To generate diverse, adversarial, yet realistic scenarios, we propose SEAL, a scenario perturbation approach which leverages learned scoring functions and adversarial, human-like skills. SEAL-perturbed scenarios are more realistic than SOTA baselines, leading to an improvement in ego task success of more than 20% across real-world, in-distribution, and out-of-distribution scenarios. To facilitate future research, we release our code and tools: https://github.com/cmubig/SEAL


Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation

arXiv.org Artificial Intelligence

Despite significant progress in robotics and embodied AI in recent years, deploying robots for long-horizon tasks remains a great challenge. The majority of prior work adheres to an open-loop philosophy and lacks real-time feedback, leading to error accumulation and poor robustness. A handful of approaches have endeavored to establish feedback mechanisms leveraging pixel-level differences or pre-trained visual representations, yet their efficacy and adaptability have been found to be constrained. Inspired by classic closed-loop control systems, we propose CLOVER, a closed-loop visuomotor control framework that incorporates feedback mechanisms to improve adaptive robotic control. CLOVER consists of a text-conditioned video diffusion model for generating visual plans as reference inputs, a measurable embedding space for accurate error quantification, and a feedback-driven controller that refines actions from feedback and initiates replans as needed. Our framework exhibits notable advancement in real-world robotic tasks and achieves state-of-the-art results on the CALVIN benchmark, improving by 8% over previous open-loop counterparts. Code and checkpoints are maintained at https://github.com/OpenDriveLab/CLOVER.


Promptable Closed-loop Traffic Simulation

arXiv.org Artificial Intelligence

Simulation stands as a cornerstone for safe and efficient autonomous driving development. At its core a simulation system ought to produce realistic, reactive, and controllable traffic patterns. In this paper, we propose ProSim, a multimodal promptable closed-loop traffic simulation framework. ProSim allows the user to give a complex set of numerical, categorical or textual prompts to instruct each agent's behavior and intention. ProSim then rolls out a traffic scenario in a closed-loop manner, modeling each agent's interaction with other traffic participants. Our experiments show that ProSim achieves high prompt controllability given different user prompts, while reaching competitive performance on the Waymo Sim Agents Challenge when no prompt is given. To support research on promptable traffic simulation, we create ProSim-Instruct-520k, a multimodal prompt-scenario paired driving dataset with over 10M text prompts for over 520k real-world driving scenarios. We will release code of ProSim as well as data and labeling tools of ProSim-Instruct-520k at https://ariostgx.github.io/ProSim.


Closed-Loop Magnetic Control of Medical Soft Continuum Robots for Deflection

arXiv.org Artificial Intelligence

Magnetic soft continuum robots (MSCRs) have emerged as powerful devices in endovascular interventions owing to their hyperelastic fibre matrix and enhanced magnetic manipulability. Effective closed-loop control of tethered magnetic devices contributes to the achievement of autonomous vascular robotic surgery. In this article, we employ a magnetic actuation system equipped with a single rotatable permanent magnet to achieve closed-loop deflection control of the MSCR. To this end, we establish a differential kinematic model of MSCRs exposed to non-uniform magnetic fields. The relationship between the existence and uniqueness of the Jacobian and the geometric position between robots is deduced. The control direction induced by the Jacobian is demonstrated to be crucial in simulations. Then, the corresponding quasi-static control (QSC) framework integrates a linear extended state observer to estimate model uncertainties. Finally, the effectiveness of the proposed QSC framework is validated through comparative trajectory tracking experiments with the PD controller under external disturbances. Further extensions are made for the Jacobian to path-following control at the distal end position. The proposed control framework prevents the actuator from reaching the joint limit and achieves fast, low-error tracking performance without overshooting.


Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling

arXiv.org Artificial Intelligence

The increasing availability of human demonstrations has spurred renewed interest in behavioral cloning [1, 2]. In particular, recent studies have highlighted the potential of learning from large-scale demonstrations to acquire a variety of complex skills [3, 4, 5, 6, 7, 8]. However, this approach still struggles with two common properties of human demonstrations: (i) strong temporal dependencies across multiple steps, such as idle pauses [4] and latent strategies [9, 10], (ii) large style variability across different demonstrations, including differences in proficiency [11] and preference [12]. Oftentimes, both properties are prevalent yet unlabeled in collected data, posing significant challenges to traditional behavioral cloning, which typically learns a discriminative model to map an input state to a target action. In response to these challenges, recent works have pursued a generative approach characterized by two key elements: (i) predicting a sequence of actions over multiple time steps and executing all or part of the sequence, known as action chunking [3] or receding horizon [4]; (ii) modeling the distribution of action chunks and sampling from the learned model in an independent [4, 13] or weakly dependent [3, 14] manner during deployment. Some studies find these elements crucial for learning a performant policy in controlled laboratory scenarios [3, 4], while other recent work reports opposite outcomes under practical conditions [6]. The reasons behind these conflicting results remain unclear.
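The action chunking and receding-horizon execution described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any cited work: `toy_policy`, `toy_dynamics`, `GOAL`, and the scalar state are hypothetical, and the toy policy is deterministic, whereas the generative approaches in the text sample chunks from a learned distribution.

```python
from typing import Callable, List, Tuple

Policy = Callable[[float, int], List[float]]
Dynamics = Callable[[float, float], float]


def rollout_with_chunking(
    policy: Policy,
    dynamics: Dynamics,
    state: float,
    chunk_len: int,
    exec_len: int,
    total_steps: int,
) -> Tuple[float, List[float], int]:
    """Predict chunk_len actions, execute the first exec_len, then replan.

    exec_len == chunk_len executes every chunk in full (open loop within a chunk);
    exec_len == 1 replans at every step (fully closed loop).
    """
    if not 1 <= exec_len <= chunk_len:
        raise ValueError("exec_len must be between 1 and chunk_len")
    executed: List[float] = []
    replans = 0
    while len(executed) < total_steps:
        chunk = policy(state, chunk_len)  # plan from the latest observed state
        replans += 1
        for action in chunk[:exec_len]:
            if len(executed) == total_steps:
                break
            state = dynamics(state, action)
            executed.append(action)
    return state, executed, replans


GOAL = 5.0  # hypothetical target for the toy task


def toy_policy(state: float, chunk_len: int) -> List[float]:
    """Hypothetical policy: plan unit-bounded moves toward GOAL over the chunk."""
    predicted = state
    chunk = []
    for _ in range(chunk_len):
        action = max(-1.0, min(1.0, GOAL - predicted))
        chunk.append(action)
        predicted += action
    return chunk


def toy_dynamics(state: float, action: float) -> float:
    """Hypothetical noise-free dynamics: the action is added to the state."""
    return state + action


full_chunk = rollout_with_chunking(toy_policy, toy_dynamics, 0.0, 4, 4, 8)
every_step = rollout_with_chunking(toy_policy, toy_dynamics, 0.0, 4, 1, 8)
```

The `exec_len` parameter exposes the trade-off at issue: longer execution keeps actions within a chunk consistent with one plan, while shorter execution reacts sooner to new observations at the cost of more frequent replanning.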