 robotic


Humanoid home robots are on the market – but do we really want them?

Robohub

Last year, Norwegian-US tech company 1X announced a strange new product: "the world's first consumer-ready humanoid robot designed to transform life at home". Standing 168 centimetres tall and weighing in at 30 kilograms, the US$20,000 Neo bot promises to automate common household chores such as folding laundry and loading the dishwasher. Neo has a built-in artificial intelligence (AI) system, but for tricky tasks it requires a 1X employee wearing a virtual reality helmet to remotely take over the robot. The operator can see whatever the bot does inside your house, and the process is recorded for future learning.


Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics

Neural Information Processing Systems

Applications of Reinforcement Learning (RL) in robotics are often limited by high data demand. On the other hand, approximate models are readily available in many robotics scenarios, making model-based approaches like planning a data-efficient alternative. Still, the performance of these methods suffers if the model is imprecise or wrong. In this sense, the respective strengths and weaknesses of RL and model-based planners are complementary. In the present work, we investigate how both approaches can be integrated into one framework that combines their strengths. We introduce Learning to Execute (L2E), which leverages information contained in approximate plans to learn universal policies that are conditioned on plans. In our robotic manipulation experiments, L2E exhibits increased performance when compared to pure RL, pure planning, or baseline methods combining learning and planning.


Learning Visually Interpretable Oscillator Networks for Soft Continuum Robots from Video

Krauss, Henrik, Licher, Johann, Takeishi, Naoya, Raatz, Annika, Yairi, Takehisa

arXiv.org Artificial Intelligence

Data-driven learning of soft continuum robot (SCR) dynamics from high-dimensional observations offers flexibility but often lacks physical interpretability, while model-based approaches require prior knowledge and can be computationally expensive. We bridge this gap by introducing (1) the Attention Broadcast Decoder (ABCD), a plug-and-play module for autoencoder-based latent dynamics learning that generates pixel-accurate attention maps localizing each latent dimension's contribution while filtering static backgrounds. (2) By coupling these attention maps to 2D oscillator networks, we enable direct on-image visualization of learned dynamics (masses, stiffness, and forces) without prior knowledge. We validate our approach on single- and double-segment SCRs, demonstrating that ABCD-based models significantly improve multi-step prediction accuracy: 5.7x error reduction for Koopman operators and 3.5x for oscillator networks on the two-segment robot. The learned oscillator network autonomously discovers a chain structure of oscillators. Unlike standard methods, ABCD models enable smooth latent space extrapolation beyond training data. This fully data-driven approach yields compact, physically interpretable models suitable for control applications.


Introducing V-Soft Pro: a Modular Platform for a Transhumeral Prosthesis with Controllable Stiffness

Milazzo, Giuseppe, Grioli, Giorgio, Bicchi, Antonio, Catalano, Manuel G.

arXiv.org Artificial Intelligence

Current upper limb prostheses aim to enhance user independence in daily activities by incorporating basic motor functions. However, they fall short of replicating the natural movement and interaction capabilities of the human arm. In contrast, human limbs leverage intrinsic compliance and actively modulate joint stiffness, enabling adaptive responses to varying tasks, impact absorption, and efficient energy transfer during dynamic actions. Inspired by this adaptability, we developed a transhumeral prosthesis with Variable Stiffness Actuators (VSAs) to replicate the controllable compliance found in biological joints. The proposed prosthesis features a modular design, allowing customization for different residual limb shapes and accommodating a range of independent control signals derived from users' biological cues. Integrated elastic elements passively support more natural movements, facilitate safe interactions with the environment, and adapt to diverse task requirements. This paper presents a comprehensive overview of the platform and its functionalities, highlighting its potential applications in the field of prosthetics.


X-SYCON: Xylem-Inspired Passive Gradient Control for Communication-Free Swarm Response in Dynamic Disaster Environments

Baek, Arthur Ji Sung, Martin, Geoffrey

arXiv.org Artificial Intelligence

We present X-SYCON, a xylem-inspired multi-agent architecture in which coordination emerges from passive field dynamics rather than explicit planning or communication. Incidents (demands) and obstructions (hazards) continually write diffusing and decaying scalar fields, and agents greedily ascend a local utility $U=\phi_{\mathrm{DE}}-\kappa\,\phi_{\mathrm{HZ}}$ with light anti-congestion and separation. A beaconing rule triggered on first contact temporarily deepens the local demand sink, accelerating completion without reducing time-to-first-response. Across dynamic, partially blocked simulated environments, we observe low miss rates and stable throughput with interpretable, tunable trade-offs over carrier count, arrival rate, hazard density, and hazard sensitivity $\kappa$. We derive that a characteristic hydraulic length scale $\ell\approx\sqrt{D/\lambda}$ predicts recruitment range in a continuum approximation, and we provide a work-conservation (Ohm-law) bound consistent with sublinear capacity scaling with team size. Empirically: (i) soft hazard penalties yield fewer misses when obstacles already block motion; (ii) throughput saturates sublinearly with carriers while reliability improves sharply; (iii) stronger arrivals can reduce misses by sustaining sinks that recruit help; and (iv) phase-stability regions shrink with hazard density but are recovered by more carriers or higher arrivals. We refer to X-SYCON as an instance of Distributed Passive Computation and Control, and we evaluate it in simulations modeling communication-denied disaster response and other constrained sensing-action regimes.
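The field-following idea in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the grid discretization, the explicit update step, the parameter values, and the function names (`diffuse_decay`, `greedy_step`, `length_scale`) are all assumptions made for illustration, and the anti-congestion, separation, and beaconing rules are omitted.

```python
import math

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def diffuse_decay(field, sources, D=0.2, lam=0.05):
    """One explicit step of d(phi)/dt = D * laplacian(phi) - lam * phi + source.

    `sources` maps (row, col) -> injection rate. D and lam are assumed values;
    this explicit scheme needs 4 * D + lam <= 1 to stay stable.
    """
    n, m = len(field), len(field[0])
    new = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            lap = 0.0
            for di, dj in NEIGHBOURS:
                ni, nj = i + di, j + dj
                if 0 <= ni < n and 0 <= nj < m:  # zero-flux boundary
                    lap += field[ni][nj] - field[i][j]
            new[i][j] = (field[i][j] + D * lap - lam * field[i][j]
                         + sources.get((i, j), 0.0))
    return new

def utility(pos, demand, hazard, kappa):
    """Local utility U = phi_DE - kappa * phi_HZ at a grid cell."""
    i, j = pos
    return demand[i][j] - kappa * hazard[i][j]

def greedy_step(pos, demand, hazard, kappa=1.0):
    """Move to the 4-neighbour with the highest utility, or stay put."""
    n, m = len(demand), len(demand[0])
    best, best_u = pos, utility(pos, demand, hazard, kappa)
    for di, dj in NEIGHBOURS:
        cand = (pos[0] + di, pos[1] + dj)
        if 0 <= cand[0] < n and 0 <= cand[1] < m:
            u = utility(cand, demand, hazard, kappa)
            if u > best_u:
                best, best_u = cand, u
    return best

def length_scale(D, lam):
    """Characteristic length sqrt(D / lam) over which a field decays."""
    return math.sqrt(D / lam)

if __name__ == "__main__":
    size = 9
    demand = [[0.0] * size for _ in range(size)]
    hazard = [[0.0] * size for _ in range(size)]
    for _ in range(400):  # let the demand field settle
        demand = diffuse_decay(demand, {(4, 7): 1.0})
    pos = (4, 0)
    for _ in range(30):  # agent climbs the field with no messaging
        pos = greedy_step(pos, demand, hazard)
    print("agent stopped at", pos, "length scale", length_scale(0.2, 0.05))
```

In this toy setting a single agent needs no message from the incident site: the demand field alone carries the information, and the ratio of diffusion to decay sets how far away an agent can still sense a usable gradient.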


Robot-Powered Data Flywheels: Deploying Robots in the Wild for Continual Data Collection and Foundation Model Adaptation

Grannen, Jennifer, Pan, Michelle, Llontop, Kenneth, Ho, Cherie, Zolotas, Mark, Bohg, Jeannette, Sadigh, Dorsa

arXiv.org Artificial Intelligence

Foundation models (FMs) have unlocked powerful zero-shot capabilities in vision and language, yet their reliance on internet pretraining data leaves them brittle in unstructured, real-world settings. The messy, real-world data encountered during deployment (e.g., occluded or multilingual text) remains massively underrepresented in existing corpora. Robots, as embodied agents, are uniquely positioned to close this gap: they can act in physical environments to collect large-scale, real-world data that enriches FM training with precisely the examples current models lack. We introduce the Robot-Powered Data Flywheel, a framework that transforms robots from FM consumers into data generators. By deploying robots equipped with FMs in the wild, we enable a virtuous cycle: robots perform useful tasks while collecting real-world data that improves both domain-specific adaptation and domain-adjacent generalization. We instantiate this framework with Scanford, a mobile manipulator deployed in the East Asia Library for 2 weeks. Scanford autonomously scans shelves, identifies books using a vision-language model (VLM), and leverages the library catalog to label images without human annotation. This deployment both aids librarians and produces a dataset to finetune the underlying VLM, improving performance on the domain-specific in-the-wild library setting and on domain-adjacent multilingual OCR benchmarks. Using data collected from 2103 shelves, Scanford improves VLM performance on book identification from 32.0% to 71.8% and boosts domain-adjacent multilingual OCR from 24.8% to 46.6% (English) and 30.8% to 38.0% (Chinese), while saving ~18.7 hrs of human time. These results highlight how robot-powered data flywheels can both reduce human effort in real deployments and unlock new pathways for continually adapting FMs to the messiness of reality. More details are at: https://scanford-robot.github.io


Robot Talk Episode 134 – Robotics as a hobby, with Kevin McAleer

Robohub

Claire chatted to Kevin McAleer from kevsrobots about how to get started building robots at home. Kevin McAleer is a hobbyist robotics fanatic who likes to build robots, share videos about them on YouTube and teach people how to do the same. Kev has been building robots since 2019, when he got his first 3D printer and wanted to make more interesting builds. Kev has a degree in Computer Science, and because his day job is relatively hands-off, this hobby gives his creativity an outlet. Kev is a huge fan of Python and MicroPython for embedded devices, and has a website, kevsrobots.com.


Google DeepMind Hires Former CTO of Boston Dynamics as the Company Pushes Deeper Into Robotics

WIRED

DeepMind's chief says he envisions Gemini as an operating system for physical robots. The company has hired Aaron Saunders to help make that a reality. Google DeepMind has hired the former Chief Technology Officer of Boston Dynamics as the company pushes deeper into robotics. Aaron Saunders, who is partly responsible for giving the world backflipping and dancing machines, joined as the VP of hardware engineering earlier this month. The hire is a key part of CEO Demis Hassabis' vision for Gemini to become a sort of robot operating system, similar to how Google supplies its Android software to an array of smartphone manufacturers.


Monolithic Units: Actuation, Sensing, and Simulation for Integrated Soft Robot Design

Exley, Trevor, Nardin, Anderson Brazil, Trunin, Petr, Cafiso, Diana, Beccai, Lucia

arXiv.org Artificial Intelligence

This work introduces the Monolithic Unit (MU), an actuator-lattice-sensor building block for soft robotics. The MU integrates pneumatic actuation, a compliant lattice envelope, and candidate sites for optical waveguide sensing into a single printed body. In order to study reproducibility and scalability, a parametric design framework establishes deterministic rules linking actuator chamber dimensions to lattice unit cell size. Experimental homogenization of lattice specimens provides effective material properties for finite element simulation. Within this simulation environment, sensor placement is treated as a discrete optimization problem, where a finite set of candidate waveguide paths derived from lattice nodes is evaluated by introducing local stiffening, and the configuration minimizing deviation from baseline mechanical response is selected. Optimized models are fabricated and experimentally characterized, validating the preservation of mechanical performance while enabling embedded sensing. The workflow is further extended to scaled units and a two-finger gripper, demonstrating generality of the MU concept. This approach advances monolithic soft robotic design by combining reproducible co-design rules with simulation-informed sensor integration.
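The sensor-placement step described above, scoring a finite set of candidate paths and keeping the one that least disturbs the baseline response, can be sketched as an exhaustive search. Everything specific here is hypothetical: the candidate names, the stiffening values, the RMS deviation metric, and `toy_response`, which is a one-line stand-in for the finite element simulation and not the authors' model.

```python
import math

# Hypothetical load cases and baseline stiffness for the toy model.
PRESSURES = [1.0, 2.0, 3.0]
BASE_STIFFNESS = 2.0

# Hypothetical local stiffening introduced by each candidate waveguide path.
STIFFENING = {"path_a": 0.30, "path_b": 0.05, "path_c": 0.12}

def toy_response(extra_stiffness):
    """Stand-in for an FE run: displacement = pressure / total stiffness."""
    return [p / (BASE_STIFFNESS + extra_stiffness) for p in PRESSURES]

def rms_deviation(response, baseline):
    """Root-mean-square difference between two response curves."""
    total = sum((r - b) ** 2 for r, b in zip(response, baseline))
    return math.sqrt(total / len(baseline))

def select_placement(candidates, simulate, baseline):
    """Evaluate every candidate and return (deviation, candidate) for the best.

    `simulate` maps a candidate to its response curve; the search is
    exhaustive because the candidate set is finite and small.
    """
    scored = [(rms_deviation(simulate(c), baseline), c) for c in candidates]
    return min(scored, key=lambda pair: pair[0])

if __name__ == "__main__":
    baseline = toy_response(0.0)  # response with no embedded waveguide
    deviation, best = select_placement(
        STIFFENING, lambda c: toy_response(STIFFENING[c]), baseline
    )
    print(best, deviation)
```

Exhaustive evaluation is a reasonable default when each candidate is one simulation run and the candidate set comes from a bounded number of lattice nodes. A larger search space would call for a different strategy.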


ARCSnake V2: An Amphibious Multi-Domain Screw-Propelled Snake-Like Robot

Wickenhiser, Sara, Peiros, Lizzie, Joyce, Calvin, Gavrilrov, Peter, Mukherjee, Sujaan, Sylvester, Syler, Zhou, Junrong, Cheung, Mandy, Lim, Jason, Richter, Florian, Yip, Michael C.

arXiv.org Artificial Intelligence

Robotic exploration in extreme environments, such as caves, oceans, and planetary surfaces, poses significant challenges, particularly in locomotion across diverse terrains. Conventional wheeled or legged robots often struggle in these contexts due to surface variability. This paper presents ARCSnake V2, an amphibious, screw-propelled, snake-like robot designed for teleoperated or autonomous locomotion across land, granular media, and aquatic environments. ARCSnake V2 combines the high mobility of hyper-redundant snake robots with the terrain versatility of Archimedean screw propulsion. Key contributions include a water-sealed mechanical design with serially linked screw and joint actuation, an integrated buoyancy control system, and teleoperation via a kinematically-matched handheld controller. The robot's design and control architecture enable multiple locomotion modes (screwing, wheeling, and sidewinding) with smooth transitions between them.