Energy
A QP Framework for Improving Data Collection: Quantifying Device-Controller Performance in Robot Teleoperation
Zhao, Yuxuan, Tang, Yuanchen, Zhang, Jindi, Yu, Hongyu
Robot learning empowers the robot system with human brain-like intelligence to autonomously acquire and adapt skills through experience, enhancing flexibility and adaptability in various environments. Aimed at achieving a similar level of capability in large language models (LLMs) for embodied intelligence, data quality plays a crucial role in training a foundational model with diverse robot skills. In this study, we investigate the collection of data for manipulation tasks using teleoperation devices. Different devices yield varying effects when paired with corresponding controller strategies, including position-based inverse kinematics (IK) control, torque-based inverse dynamics (ID) control, and optimization-based compliance control. In this paper, we develop a teleoperation pipeline that is compatible with different teleoperation devices and manipulator controllers. Within the pipeline, we construct the optimal QP formulation with the dynamic nullspace and the impedance tracking as the novel optimal controller to achieve compliant pose tracking and singularity avoidance. Regarding the optimal controller, it adaptively adjusts the weights assignment depending on the robot joint manipulability that reflects the state of joint configuration for the pose tracking in the form of impedance control and singularity avoidance with nullspace tracking. Analysis of quantitative experimental results suggests the quality of the teleoperated trajectory data, including tracking error, occurrence of singularity, and the smoothness of the joints' trajectory, with different combinations of teleoperation interface and the motion controller.
CAVER: Curious Audiovisual Exploring Robot
Macesanu, Luca, Folefack, Boueny, Singh, Samik, Ray, Ruchira, Abbatematteo, Ben, Martín-Martín, Roberto
Abstract-- Multimodal audiovisual perception can enable new avenues for robotic manipulation, from better material classification to the imitation of demonstrations for which only audio signals are available (e.g., playing a tune by ear). However, to unlock such multimodal potential, robots need to learn the correlations between an object's visual appearance and the sound it generates when they interact with it. Such an active sensorimotor experience requires new interaction capabilities, representations, and exploration methods to guide the robot in efficiently building increasingly rich audiovisual knowledge. In this work, we present CA VER, a novel robot that builds and utilizes rich audiovisual representations of objects. CA VER includes three novel contributions: 1) a novel 3D printed end-effector, attachable to parallel grippers, that excites objects' audio responses, 2) an audiovisual representation that combines local and global appearance information with sound features, and 3) an exploration algorithm that uses and builds the audiovisual representation in a curiosity-driven manner that prioritizes interacting with high uncertainty objects to obtain good coverage of surprising audio with fewer interactions. We demonstrate that CA VER builds rich representations in different scenarios more efficiently than several exploration baselines, and that the learned audiovisual representation leads to significant improvements in material classification and the imitation of audio-only human demonstrations. Humans learn and exploit multimodal audiovisual cues in everyday life to obtain a more complete understanding of their environment and broader manipulation capabilities. We routinely fuse audio and vision to understand materials and reproduce behaviors: tapping a mug reveals glass vs. ceramic, and hearing a melody lets a musician find the right key. Building similar capabilities in robots would increase their robustness and autonomy, but requires a representation that couples how things look with how they sound when interacted with, and a way to acquire that representation efficiently through interaction.
Resource Allocation in Hybrid Radio-Optical IoT Networks using GNN with Multi-task Learning
Hamrouni, Aymen, Pollin, Sofie, Sallouha, Hazem
This paper addresses the problem of dual-technology scheduling in hybrid Internet of Things (IoT) networks that integrate Optical Wireless Communication (OWC) alongside Radio Frequency (RF). We begin by formulating a Mixed-Integer Nonlinear Programming (MINLP) model that jointly considers throughput maximization and delay minimization between access points and IoT nodes under energy and link availability constraints. However, given the intractability of solving such NP-hard problems at scale and the impractical assumption of full channel observability, we propose the Dual-Graph Embedding with Transformer (DGET) framework, a supervised multi-task learning architecture combining a two-stage Graph Neural Networks (GNNs) with a Transformer-based encoder. The first stage employs a transductive GNN that encodes the known graph topology and initial node and link states (e.g., energy levels, available links, and queued transmissions). The second stage introduces an inductive GNN for temporal refinement, which learns to generalize these embeddings to the evolved states of the same network, capturing changes in energy and queue dynamics over time, by aligning them with ground-truth scheduling decisions through a consistency loss. These enriched embeddings are then processed by a classifier for the communication links with a Transformer encoder that captures cross-link dependencies through multi-head self-attention via classification loss. A further post-processing step refines these predictions using a top-2 feasibility check to further ensure valid link allocations. Simulation results show that hybrid RF-OWC networks outperform standalone RF systems by handling higher traffic loads more efficiently and reducing the Age of Information (AoI) by up to 20%, all while maintaining comparable energy consumption. The proposed DGET framework, compared to traditional optimization-based methods, achieves near-optimal scheduling with over 90% classification accuracy, reduces computational complexity, and demonstrates higher robustness under partial channel observability.
An Evaluation of LLMs Inference on Popular Single-board Computers
Tung, null, Nguyen, null, Nguyen, Tuyen
The growing demand for on-device large language model (LLM) inference is driving interest in deploying lightweight, cost-effective AI solutions on edge hardware. Single-board computers (SBCs) such as the Raspberry Pi and Orange Pi offer a promising platform for localized, privacy-preserving inference-but remain underexplored in the context of LLM workloads. In this work, we benchmark the performance of 25 quantized open-source LLMs across three SBCs-Raspberry Pi 4, Raspberry Pi 5, and Orange Pi 5 Pro-using two inference runtimes: Ollama and Llamafile. We evaluate generation throughput, memory usage, and power consumption under varying CPU configurations, using multiple prompt types to simulate realistic workloads. Our results show that SBCs can reliably support models up to 1.5B parameters, with Llamafile achieving up to 4x higher throughput and 30-40% lower power usage than Ollama. We identify architecture-specific bottlenecks, highlight runtime-level trade-offs, and provide practical deployment recommendations. This study offers the first broad evaluation of LLM inference on SBCs, bridging the gap between high-performance language models and affordable edge computing.
Contact Wasserstein Geodesics for Non-Conservative Schrödinger Bridges
Testa, Andrea, Hauberg, Søren, Asfour, Tamim, Rozo, Leonel
The Schrödinger Bridge provides a principled framework for modeling stochastic processes between distributions; however, existing methods are limited by energy-conservation assumptions, which constrains the bridge's shape preventing it from model varying-energy phenomena. To overcome this, we introduce the non-conservative generalized Schrödinger bridge (NCGSB), a novel, energy-varying reformulation based on contact Hamiltonian mechanics. By allowing energy to change over time, the NCGSB provides a broader class of real-world stochastic processes, capturing richer and more faithful intermediate dynamics. By parameterizing the Wasserstein manifold, we lift the bridge problem to a tractable geodesic computation in a finite-dimensional space. Unlike computationally expensive iterative solutions, our contact Wasserstein geodesic (CWG) is naturally implemented via a ResNet architecture and relies on a non-iterative solver with near-linear complexity. Furthermore, CWG supports guided generation by modulating a task-specific distance metric. We validate our framework on tasks including manifold navigation, molecular dynamics predictions, and image generation, demonstrating its practical benefits and versatility.
Precipitation nowcasting of satellite data using physically-aligned neural networks
Catão, Antônio, Poveda, Melvin, Voltarelli, Leonardo, Orenstein, Paulo
Accurate short-term precipitation forecasts predominantly rely on dense weather-radar networks, limiting operational value in places most exposed to climate extremes. We present TUPANN (Transferable and Universal Physics-Aligned Nowcasting Network), a satellite-only model trained on GOES-16 RRQPE. Unlike most deep learning models for nowcasting, TUPANN decomposes the forecast into physically meaningful components: a variational encoder-decoder infers motion and intensity fields from recent imagery under optical-flow supervision, a lead-time-conditioned MaxViT evolves the latent state, and a differentiable advection operator reconstructs future frames. We evaluate TUPANN on both GOES-16 and IMERG data, in up to four distinct climates (Rio de Janeiro, Manaus, Miami, La Paz) at 10-180min lead times using the CSI and HSS metrics over 4-64 mm/h thresholds. Comparisons against optical-flow, deep learning and hybrid baselines show that TUPANN achieves the best or second-best skill in most settings, with pronounced gains at higher thresholds. Training on multiple cities further improves performance, while cross-city experiments show modest degradation and occasional gains for rare heavy-rain regimes. The model produces smooth, interpretable motion fields aligned with numerical optical flow and runs in near real time due to the low latency of GOES-16. These results indicate that physically aligned learning can provide nowcasts that are skillful, transferable and global.
Seabed-Net: A multi-task network for joint bathymetry estimation and seabed classification from remote sensing imagery in shallow waters
Agrafiotis, Panagiotis, Demir, Begüm
Accurate, detailed, and regularly updated bathymetry, coupled with complex semantic content, is essential for under-mapped shallow-water environments facing increasing climatological and anthropogenic pressures. However, existing approaches that derive either depth or seabed classes from remote sensing imagery treat these tasks in isolation, forfeiting the mutual benefits of their interaction and hindering the broader adoption of deep learning methods. To address these limitations, we introduce Seabed-Net, a unified multi-task framework that simultaneously predicts bathymetry and pixel-based seabed classification from remote sensing imagery of various resolutions. Seabed-Net employs dual-branch encoders for bathymetry estimation and pixel-based seabed classification, integrates cross-task features via an Attention Feature Fusion module and a windowed Swin-Transformer fusion block, and balances objectives through dynamic task uncertainty weighting. In extensive evaluations at two heterogeneous coastal sites, it consistently outperforms traditional empirical models and traditional machine learning regression methods, achieving up to 75\% lower RMSE. It also reduces bathymetric RMSE by 10-30\% compared to state-of-the-art single-task and multi-task baselines and improves seabed classification accuracy up to 8\%. Qualitative analyses further demonstrate enhanced spatial consistency, sharper habitat boundaries, and corrected depth biases in low-contrast regions. These results confirm that jointly modeling depth with both substrate and seabed habitats yields synergistic gains, offering a robust, open solution for integrated shallow-water mapping. Code and pretrained weights are available at https://github.com/pagraf/Seabed-Net.
NeRC: Neural Ranging Correction through Differentiable Moving Horizon Location Estimation
Weng, Xu, Ling, K. V., Liu, Haochen, Wang, Bingheng, Cao, Kun
GNSS localization using everyday mobile devices is challenging in urban environments, as ranging errors caused by the complex propagation of satellite signals and low-quality onboard GNSS hardware are blamed for undermining positioning accuracy. Researchers have pinned their hopes on data-driven methods to regress such ranging errors from raw measurements. However, the grueling annotation of ranging errors impedes their pace. This paper presents a robust end-to-end Neural Ranging Correction (NeRC) framework, where localization-related metrics serve as the task objective for training the neural modules. Instead of seeking impractical ranging error labels, we train the neural network using ground-truth locations that are relatively easy to obtain. This functionality is supported by differentiable moving horizon location estimation (MHE) that handles a horizon of measurements for positioning and backpropagates the gradients for training. Even better, as a blessing of end-to-end learning, we propose a new training paradigm using Euclidean Distance Field (EDF) cost maps, which alleviates the demands on labeled locations. We evaluate the proposed NeRC on public benchmarks and our collected datasets, demonstrating its distinguished improvement in positioning accuracy. We also deploy NeRC on the edge to verify its real-time performance for mobile devices.
Dominican Republic suffers nationwide power cut after 'cascade of failures'
Dominican Republic suffers nationwide power cut after'cascade of failures' The Dominican Republic has experienced a nationwide power cut which officials said was linked to a failure in the electricity transmission system. At 13:23 local time (17:23 GMT) an issue at a substation caused a nationwide interruption to power services, the state-owned Dominican Electricity Transmission Company said, citing the country's energy minister Joel Santos Echeverría. Echeverría said a thorough investigation would be carried out to identify the cause and that work was under way to quickly restore power. The Caribbean nation, which is home to around 11 million people, has been experiencing smaller blackouts in recent weeks, the AFP news agency reports. Officials at the state-owned power company said generation units in two major power plants had shut down, causing a cascade of failures in other parts of the grid.
The Download: surviving extreme temperatures, and the big whale-wind turbine conspiracy
Plus: China is catching up with America's AI lead Climate change is subjecting vulnerable people to temperatures that push their limits. In 2023, about 47,000 heat-related deaths are believed to have occurred in Europe. Researchers estimate that climate change could add an extra 2.3 million European heat deaths this century. That's heightened the stakes for solving the mystery of just what happens to bodies in extreme conditions. While we broadly know how people thermoregulate, the science of keeping warm or cool is mottled with blind spots. Researchers around the world are revising rules about when extremes veer from uncomfortable to deadly.