Energy
Forecasting Multivariate Urban Data via Decomposition and Spatio-Temporal Graph Analysis
Sohrabbeig, Amirhossein, Ardakanian, Omid, Musilek, Petr
Long-term forecasting of multivariate urban data poses a significant challenge due to the complex spatiotemporal dependencies inherent in such datasets. This paper presents DST, a novel multivariate time-series forecasting model that integrates graph attention and temporal convolution within a Graph Neural Network (GNN) to effectively capture spatial and temporal dependencies, respectively. To enhance model performance, we apply a decomposition-based preprocessing step that isolates trend, seasonal, and residual components of the time series, enabling the learning of distinct graph structures for different time-series components. Extensive experiments on real-world urban datasets, including electricity demand, weather metrics, carbon intensity, and air pollution, demonstrate the effectiveness of DST across a range of forecast horizons, from several days to one month. Specifically, our approach achieves an average improvement of 2.89% to 9.10% in long-term forecasting accuracy over state-of-the-art time-series forecasting models.
GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data
Deng, Shengliang, Yan, Mi, Wei, Songlin, Ma, Haixin, Yang, Yuxin, Chen, Jiayi, Zhang, Zhiqi, Yang, Taoyu, Zhang, Xuheng, Zhang, Wenhao, Cui, Heming, Zhang, Zhizheng, Wang, He
Embodied foundation models are gaining increasing attention for their zero-shot generalization, scalability, and adaptability to new tasks through few-shot post-training. However, existing models rely heavily on real-world data, which is costly and labor-intensive to collect. Synthetic data offers a cost-effective alternative, yet its potential remains largely underexplored. To bridge this gap, we explore the feasibility of training Vision-Language-Action models entirely with large-scale synthetic action data. We curate SynGrasp-1B, a billion-frame robotic grasping dataset generated in simulation with photorealistic rendering and extensive domain randomization. Building on this, we present GraspVLA, a VLA model pretrained on large-scale synthetic action data as a foundational model for grasping tasks. GraspVLA integrates autoregressive perception tasks and flow-matching-based action generation into a unified Chain-of-Thought process, enabling joint training on synthetic action data and Internet semantics data. This design helps mitigate sim-to-real gaps and facilitates the transfer of learned actions to a broader range of Internet-covered objects, achieving open-vocabulary generalization in grasping. Extensive evaluations across real-world and simulation benchmarks demonstrate GraspVLA's advanced zero-shot generalizability and few-shot adaptability to specific human preferences. We will release SynGrasp-1B dataset and pre-trained weights to benefit the community.
RoboComm: A DID-based scalable and privacy-preserving Robot-to-Robot interaction over state channels
Singh, Roshan, Pandey, Sushant
In a multi robot system establishing trust amongst untrusted robots from different organisations while preserving a robot's privacy is a challenge. Recently decentralized technologies such as smart contract and blockchain are being explored for applications in robotics. However, the limited transaction processing and high maintenance cost hinder the widespread adoption of such approaches. Moreover, blockchain transactions be they on public or private permissioned blockchain are publically readable which further fails to preserve the confidentiality of the robot's data and privacy of the robot. In this work, we propose RoboComm a Decentralized Identity based approach for privacy-preserving interaction between robots. With DID a component of Self-Sovereign Identity; robots can authenticate each other independently without relying on any third-party service. Verifiable Credentials enable private data associated with a robot to be stored within the robot's hardware, unlike existing blockchain based approaches where the data has to be on the blockchain. We improve throughput by allowing message exchange over state channels. Being a blockchain backed solution RoboComm provides a trustworthy system without relying on a single party. Moreover, we implement our proposed approach to demonstrate the feasibility of our solution.
Embodied Intelligence for Sustainable Flight: A Soaring Robot with Active Morphological Control
Elmkaiel, Ghadeer, Schmitt, Syn, Muehlebach, Michael
Achieving both agile maneuverability and high energy efficiency in aerial robots, particularly in dynamic wind environments, remains challenging. Conventional thruster-powered systems offer agility but suffer from high energy consumption, while fixed-wing designs are efficient but lack hovering and maneuvering capabilities. We present Floaty, a shape-changing robot that overcomes these limitations by passively soaring, harnessing wind energy through intelligent morphological control inspired by birds. Floaty's design is optimized for passive stability, and its control policy is derived from an experimentally learned aerodynamic model, enabling precise attitude and position control without active propulsion. Wind tunnel experiments demonstrate Floaty's ability to hover, maneuver, and reject disturbances in vertical airflows up to 10 m/s. Crucially, Floaty achieves this with a specific power consumption of 10 W/kg, an order of magnitude lower than thruster-powered systems. This introduces a paradigm for energy-efficient aerial robotics, leveraging morphological intelligence and control to operate sustainably in challenging wind conditions.
Impedance Primitive-augmented Hierarchical Reinforcement Learning for Sequential Tasks
Tahmaz, Amin Berjaoui, Prakash, Ravi, Kober, Jens
This paper presents an Impedance Primitive-augmented hierarchical reinforcement learning framework for efficient robotic manipulation in sequential contact tasks. We leverage this hierarchical structure to sequentially execute behavior primitives with variable stiffness control capabilities for contact tasks. Our proposed approach relies on three key components: an action space enabling variable stiffness control, an adaptive stiffness controller for dynamic stiffness adjustments during primitive execution, and affordance coupling for efficient exploration while encouraging compliance. Through comprehensive training and evaluation, our framework learns efficient stiffness control capabilities and demonstrates improvements in learning efficiency, compositionality in primitive selection, and success rates compared to the state-of-the-art. The training environments include block lifting, door opening, object pushing, and surface cleaning. Real world evaluations further confirm the framework's sim2real capability. This work lays the foundation for more adaptive and versatile robotic manipulation systems, with potential applications in more complex contact-based tasks.
Energy-Efficient Learning-Based Beamforming for ISAC-Enabled V2X Networks
Shang, Chen, Yu, Jiadong, Hoang, Dinh Thai
This work proposes an energy-efficient, learning-based beamforming scheme for integrated sensing and communication (ISAC)-enabled V2X networks. Specifically, we first model the dynamic and uncertain nature of V2X environments as a Markov Decision Process. This formulation allows the roadside unit to generate beamforming decisions based solely on current sensing information, thereby eliminating the need for frequent pilot transmissions and extensive channel state information acquisition. We then develop a deep reinforcement learning (DRL) algorithm to jointly optimize beamforming and power allocation, ensuring both communication throughput and sensing accuracy in highly dynamic scenario. To address the high energy demands of conventional learning-based schemes, we embed spiking neural networks (SNNs) into the DRL framework. Leveraging their event-driven and sparsely activated architecture, SNNs significantly enhance energy efficiency while maintaining robust performance. Simulation results confirm that the proposed method achieves substantial energy savings and superior communication performance, demonstrating its potential to support green and sustainable connectivity in future V2X systems.
Distribution Shift Aware Neural Tabular Learning
Ying, Wangyang, Gong, Nanxu, Wang, Dongjie, Wang, Xinyuan, Malarkkan, Arun Vignesh, Gupta, Vivek, Reddy, Chandan K., Fu, Yanjie
Tabular learning transforms raw features into optimized spaces for downstream tasks, but its effectiveness deteriorates under distribution shifts between training and testing data. We formalize this challenge as the Distribution Shift Tabular Learning (DSTL) problem and propose a novel Shift-Aware Feature Transformation (SAFT) framework to address it. SAFT reframes tabular learning from a discrete search task into a continuous representation-generation paradigm, enabling differentiable optimization over transformed feature sets. SAFT integrates three mechanisms to ensure robustness: (i) shift-resistant representation via embedding decorrelation and sample reweighting, (ii) flatness-aware generation through suboptimal embedding averaging, and (iii) normalization-based alignment between training and test distributions. Extensive experiments show that SAFT consistently outperforms prior tabular learning methods in terms of robustness, effectiveness, and generalization ability under diverse real-world distribution shifts.
FlipWalker: Jacob's Ladder toy-inspired robot for locomotion across diverse, complex terrain
Li, Diancheng, Ralston, Nia, Hagen, Bastiaan, Tan, Phoebe, Robertson, Matthew A.
-- This paper introduces FlipWalker, a novel under-actuated robot locomotion system inspired by Jacob's Ladder illusion toy, designed to traverse challenging terrains where wheeled robots often struggle. Like the Jacob's Ladder toy, Flip-Walker features two interconnected segments joined by flexible cables, enabling it to pivot and flip around singularities in a manner reminiscent of the toy's cascading motion. Actuation is provided by motor-driven legs within each segment that push off either the ground or the opposing segment, depending on the robot's current configuration. A physics-based model of the underactuated flipping dynamics is formulated to elucidate the critical design parameters governing forward motion and obstacle clearance or climbing. The untethered prototype weighs 0.78 kg, achieves a maximum flipping speed of 0.2 body lengths per second. Experimental trials on artificial grass, river rocks, and snow demonstrate that FlipWalker's flipping strategy, which relies on ground reaction forces applied normal to the surface, offers a promising alternative to traditional locomotion for navigating irregular outdoor terrain. In the realm of robotic locomotion, an evolving array of techniques has been harnessed to navigate increasingly complex terrains.
Automated classification of natural habitats using ground-level imagery
Tourian, Mahdis, Rowlands, Sareh, Vandaele, Remy, Fancourt, Max, Mein, Rebecca, Williams, Hywel T. P.
Accurate classification of terrestrial habitats is critical for biodiversity conservation, ecological monitoring, and land-use planning. Several habitat classification schemes are in use, typically based on analysis of satellite imagery with validation by field ecologists. Here we present a methodology for classification of habitats based solely on ground-level imagery (photographs), offering improved validation and the ability to classify habitats at scale (for example using citizen-science imagery). In collaboration with Natural England, a public sector organisation responsible for nature conservation in England, this study develops a classification system that applies deep learning to ground-level habitat photographs, categorising each image into one of 18 classes defined by the 'Living England' framework. Images were pre-processed using resizing, normalisation, and augmentation; re-sampling was used to balance classes in the training data and enhance model robustness. We developed and fine-tuned a DeepLabV3-ResNet101 classifier to assign a habitat class label to each photograph. Using five-fold cross-validation, the model demonstrated strong overall performance across 18 habitat classes, with accuracy and F1-scores varying between classes. Across all folds, the model achieved a mean F1-score of 0.61, with visually distinct habitats such as Bare Soil, Silt and Peat (BSSP) and Bare Sand (BS) reaching values above 0.90, and mixed or ambiguous classes scoring lower. These findings demonstrate the potential of this approach for ecological monitoring. Ground-level imagery is readily obtained, and accurate computational methods for habitat classification based on such data have many potential applications. To support use by practitioners, we also provide a simple web application that classifies uploaded images using our model.
Belief-Conditioned One-Step Diffusion: Real-Time Trajectory Planning with Just-Enough Sensing
Puthumanaillam, Gokul, Penumarti, Aditya, Vora, Manav, Padrao, Paulo, Fuentes, Jose, Bobadilla, Leonardo, Shin, Jane, Ornik, Melkior
Robots equipped with rich sensor suites can localize reliably in partially-observable environments, but powering every sensor continuously is wasteful and often infeasible. Belief-space planners address this by propagating pose-belief covariance through analytic models and switching sensors heuristically--a brittle, runtime-expensive approach. Data-driven approaches--including diffusion models--learn multi-modal trajectories from demonstrations, but presuppose an accurate, always-on state estimate. We address the largely open problem: for a given task in a mapped environment, which \textit{minimal sensor subset} must be active at each location to maintain state uncertainty \textit{just low enough} to complete the task? Our key insight is that when a diffusion planner is explicitly conditioned on a pose-belief raster and a sensor mask, the spread of its denoising trajectories yields a calibrated, differentiable proxy for the expected localisation error. Building on this insight, we present Belief-Conditioned One-Step Diffusion (B-COD), the first planner that, in a 10 ms forward pass, returns a short-horizon trajectory, per-waypoint aleatoric variances, and a proxy for localisation error--eliminating external covariance rollouts. We show that this single proxy suffices for a soft-actor-critic to choose sensors online, optimising energy while bounding pose-covariance growth. We deploy B-COD in real-time marine trials on an unmanned surface vehicle and show that it reduces sensing energy consumption while matching the goal-reach performance of an always-on baseline.