Goto

Collaborating Authors

 robotic


Learning Visually Interpretable Oscillator Networks for Soft Continuum Robots from Video

Krauss, Henrik, Licher, Johann, Takeishi, Naoya, Raatz, Annika, Yairi, Takehisa

arXiv.org Artificial Intelligence

Data-driven learning of soft continuum robot (SCR) dynamics from high-dimensional observations offers flexibility but often lacks physical interpretability, while model-based approaches require prior knowledge and can be computationally expensive. We bridge this gap by introducing (1) the Attention Broadcast Decoder (ABCD), a plug-and-play module for autoencoder-based latent dynamics learning that generates pixel-accurate attention maps localizing each latent dimension's contribution while filtering static backgrounds. (2) By coupling these attention maps to 2D oscillator networks, we enable direct on-image visualization of learned dynamics (masses, stiffness, and forces) without prior knowledge. We validate our approach on single- and double-segment SCRs, demonstrating that ABCD-based models significantly improve multi-step prediction accuracy: 5.7x error reduction for Koopman operators and 3.5x for oscillator networks on the two-segment robot. The learned oscillator network autonomously discovers a chain structure of oscillators. Unlike standard methods, ABCD models enable smooth latent space extrapolation beyond training data. This fully data-driven approach yields compact, physically interpretable models suitable for control applications.


Introducing V-Soft Pro: a Modular Platform for a Transhumeral Prosthesis with Controllable Stiffness

Milazzo, Giuseppe, Grioli, Giorgio, Bicchi, Antonio, Catalano, Manuel G.

arXiv.org Artificial Intelligence

Current upper limb prostheses aim to enhance user independence in daily activities by incorporating basic motor functions. However, they fall short of replicating the natural movement and interaction capabilities of the human arm. In contrast, human limbs leverage intrinsic compliance and actively modulate joint stiffness, enabling adaptive responses to varying tasks, impact absorption, and efficient energy transfer during dynamic actions. Inspired by this adaptability, we developed a transhumeral prosthesis with Variable Stiffness Actuators (VSAs) to replicate the controllable compliance found in biological joints. The proposed prosthesis features a modular design, allowing customization for different residual limb shapes and accommodating a range of independent control signals derived from users' biological cues. Integrated elastic elements passively support more natural movements, facilitate safe interactions with the environment, and adapt to diverse task requirements. This paper presents a comprehensive overview of the platform and its functionalities, highlighting its potential applications in the field of prosthetics.


X-SYCON: Xylem-Inspired Passive Gradient Control for Communication-Free Swarm Response in Dynamic Disaster Environments

Baek, Arthur Ji Sung, Martin, Geoffrey

arXiv.org Artificial Intelligence

We present X-SYCON, a xylem-inspired multi-agent architecture in which coordination emerges from passive field dynamics rather than explicit planning or communication. Incidents (demands) and obstructions (hazards) continually write diffusing and decaying scalar fields, and agents greedily ascend a local utility $U=ϕ_{\mathrm{DE}}-κ\,ϕ_{\mathrm{HZ}}$ with light anti-congestion and separation. A beaconing rule triggered on first contact temporarily deepens the local demand sink, accelerating completion without reducing time-to-first-response. Across dynamic, partially blocked simulated environments, we observe low miss rates and stable throughput with interpretable, tunable trade-offs over carrier count, arrival rate, hazard density, and hazard sensitivity $κ$. We derive that a characteristic hydraulic length scale $\ell\approx\sqrt{D/λ}$ predicts recruitment range in a continuum approximation, and we provide a work-conservation (Ohm-law) bound consistent with sublinear capacity scaling with team size. Empirically: (i) soft hazard penalties yield fewer misses when obstacles already block motion; (ii) throughput saturates sublinearly with carriers while reliability improves sharply; (iii) stronger arrivals can reduce misses by sustaining sinks that recruit help; and (iv) phase-stability regions shrink with hazard density but are recovered by more carriers or higher arrivals. We refer to X-SYCON as an instance of Distributed Passive Computation and Control, and we evaluate it in simulations modeling communication-denied disaster response and other constrained sensing-action regimes.


Robot-Powered Data Flywheels: Deploying Robots in the Wild for Continual Data Collection and Foundation Model Adaptation

Grannen, Jennifer, Pan, Michelle, Llontop, Kenneth, Ho, Cherie, Zolotas, Mark, Bohg, Jeannette, Sadigh, Dorsa

arXiv.org Artificial Intelligence

Foundation models (FM) have unlocked powerful zero-shot capabilities in vision and language, yet their reliance on internet pretraining data leaves them brittle in unstructured, real-world settings. The messy, real-world data encountered during deployment (e.g. occluded or multilingual text) remains massively underrepresented in existing corpora. Robots, as embodied agents, are uniquely positioned to close this gap: they can act in physical environments to collect large-scale, real-world data that enriches FM training with precisely the examples current models lack. We introduce the Robot-Powered Data Flywheel, a framework that transforms robots from FM consumers into data generators. By deploying robots equipped with FMs in the wild, we enable a virtuous cycle: robots perform useful tasks while collecting real-world data that improves both domain-specific adaptation and domain-adjacent generalization. We instantiate this framework with Scanford, a mobile manipulator deployed in the East Asia Library for 2 weeks. Scanford autonomously scans shelves, identifies books using a vision-language model (VLM), and leverages the library catalog to label images without human annotation. This deployment both aids librarians and produces a dataset to finetune the underlying VLM, improving performance on the domain-specific in-the-wild library setting and on domain-adjacent multilingual OCR benchmarks. Using data collected from 2103 shelves, Scanford improves VLM performance on book identification from 32.0% to 71.8% and boosts domain-adjacent multilingual OCR from 24.8% to 46.6% (English) and 30.8% to 38.0% (Chinese), while saving an ~18.7 hrs of human time. These results highlight how robot-powered data flywheels can both reduce human effort in real deployments and unlock new pathways for continually adapting FMs to the messiness of reality. More details are at: https://scanford-robot.github.io


Robot Talk Episode 134 – Robotics as a hobby, with Kevin McAleer

Robohub

Claire chatted to Kevin McAleer from kevsrobots about how to get started building robots at home. Kevin McAleer is a hobbyist robotics fanatic who likes to build robots, share videos about them on YouTube and teach people how to do the same. Kev has been building robots since 2019, when he got his first 3d printer and wanted to make more interesting builds. Kev has a degree in Computer Science, and because his day job is relatively hands-off, this hobby allows his creativity to have an outlet. Kev is a huge fan of Python and Micropython for embedded devices, and has a website - kevsrobots.com


Google DeepMind Hires Former CTO of Boston Dynamics as the Company Pushes Deeper Into Robotics

WIRED

DeepMind's chief says he envisions Gemini as an operating system for physical robots. The company has hired Aaron Saunders to help make that a reality. Google DeepMind has hired the former Chief Technology Officer of Boston Dynamics as the company pushes deeper into robotics. Aaron Saunders, who is partly responsible for giving the world backflipping and dancing machines, joined as the VP of hardware engineering earlier this month. The hire is a key part of CEO Demis Hassabis' vision for Gemini to become a sort of robot operating system, similar to how Google supplies its Android software to an array of smartphone manufacturers.


Monolithic Units: Actuation, Sensing, and Simulation for Integrated Soft Robot Design

Exley, Trevor, Nardin, Anderson Brazil, Trunin, Petr, Cafiso, Diana, Beccai, Lucia

arXiv.org Artificial Intelligence

This work introduces the Monolithic Unit (MU), an actuator-lattice-sensor building block for soft robotics. The MU integrates pneumatic actuation, a compliant lattice envelope, and candidate sites for optical waveguide sensing into a single printed body. In order to study reproducibility and scalability, a parametric design framework establishes deterministic rules linking actuator chamber dimensions to lattice unit cell size. Experimental homogenization of lattice specimens provides effective material properties for finite element simulation. Within this simulation environment, sensor placement is treated as a discrete optimization problem, where a finite set of candidate waveguide paths derived from lattice nodes is evaluated by introducing local stiffening, and the configuration minimizing deviation from baseline mechanical response is selected. Optimized models are fabricated and experimentally characterized, validating the preservation of mechanical performance while enabling embedded sensing. The workflow is further extended to scaled units and a two-finger gripper, demonstrating generality of the MU concept. This approach advances monolithic soft robotic design by combining reproducible co-design rules with simulation-informed sensor integration.


ARCSnake V2: An Amphibious Multi-Domain Screw-Propelled Snake-Like Robot

Wickenhiser, Sara, Peiros, Lizzie, Joyce, Calvin, Gavrilrov, Peter, Mukherjee, Sujaan, Sylvester, Syler, Zhou, Junrong, Cheung, Mandy, Lim, Jason, Richter, Florian, Yip, Michael C.

arXiv.org Artificial Intelligence

Abstract-- Robotic exploration in extreme environments--such as caves, oceans, and planetary surfaces--poses significant challenges, particularly in locomotion across diverse terrains. Conventional wheeled or legged robots often struggle in these contexts due to surface variability. This paper presents ARCSnake V2, an amphibious, screw-propelled, snake-like robot designed for teleoperated or autonomous locomotion across land, granular media, and aquatic environments. ARCSnake V2 combines the high mobility of hyper-redundant snake robots with the terrain versatility of Archimedean screw propulsion. Key contributions include a water-sealed mechanical design with serially linked screw and joint actuation, an integrated buoyancy control system, and teleoperation via a kinematically-matched handheld controller . The robot's design and control architecture enable multiple locomotion modes--screwing, wheeling, and sidewinding--with smooth transitions between them. Robotic exploration in extreme environments, such as caves, oceans and planetary surfaces, poses significant challenges for the diverse set of terrains [1].


Scalable Coverage Trajectory Synthesis on GPUs as Statistical Inference

Sun, Max M., Kwon, Jueun, Murphey, Todd

arXiv.org Artificial Intelligence

Abstract--Coverage motion planning is essential to a wide range of robotic tasks. Unlike conventional motion planning problems, which reason over temporal sequences of states, coverage motion planning requires reasoning over the spatial distribution of entire trajectories, making standard motion planning methods limited in computational efficiency and less amenable to modern parallelization frameworks. In this work, we formulate the coverage motion planning problem as a statistical inference problem from the perspective of flow matching, a generative modeling technique that has gained significant attention in recent years. The proposed formulation unifies commonly used statistical discrepancy measures, such as Kullback-Leibler divergence and Sinkhorn divergence, with a standard linear quadratic regulator problem. This paper focuses on the advantages of this formulation in terms of scalability through parallelization, highlighting its computational benefits compared to conventional methods based on waypoint tracking.


MLM: Learning Multi-task Loco-Manipulation Whole-Body Control for Quadruped Robot with Arm

Liu, Xin, Ma, Bida, Qi, Chenkun, Ding, Yan, Xu, Nuo, Zhaxizhuoma, null, Zhang, Guorong, Chen, Pengan, Liu, Kehui, Jia, Zhongjie, Guan, Chuyue, Mo, Yule, Liu, Jiaqi, Gao, Feng, Zhong, Jiangwei, Zhao, Bin, Li, Xuelong

arXiv.org Artificial Intelligence

Whole-body loco-manipulation for quadruped robots with arms remains a challenging problem, particularly in achieving multi-task control. To address this, we propose MLM, a reinforcement learning framework driven by both real-world and simulation data. It enables a six-DoF robotic arm-equipped quadruped robot to perform whole-body loco-manipulation for multiple tasks autonomously or under human teleoperation. To address the problem of balancing multiple tasks during the learning of loco-manipulation, we introduce a trajectory library with an adaptive, curriculum-based sampling mechanism. This approach allows the policy to efficiently leverage real-world collected trajectories for learning multi-task loco-manipulation. To address deployment scenarios with only historical observations and to enhance the performance of policy execution across tasks with different spatial ranges, we propose a Trajectory-Velocity Prediction policy network. It predicts unobservable future trajectories and velocities. By leveraging extensive simulation data and curriculum-based rewards, our controller achieves whole-body behaviors in simulation and zero-shot transfer to real-world deployment. Ablation studies in simulation verify the necessity and effectiveness of our approach, while real-world experiments on a Go2 robot with an Airbot robotic arm demonstrate the policy's good performance in multi-task execution.