Energy
Federated Learning in Open- and Closed-Loop EMG Decoding: A Privacy and Performance Perspective
Malcolm, Kai, Uribe, César, Yamagami, Momona
Invasive and non-invasive neural interfaces hold promise as high-bandwidth input devices for next-generation technologies. However, neural signals inherently encode sensitive information about an individual's identity and health, making data sharing for decoder training a critical privacy challenge. Federated learning (FL), a distributed, privacy-preserving learning framework, presents a promising solution, but it remains unexplored in closed-loop adaptive neural interfaces. Here, we introduce FL-based neural decoding and systematically evaluate its performance and privacy using high-dimensional electromyography signals in both open- and closed-loop scenarios. In open-loop simulations, FL significantly outperformed local learning baselines, demonstrating its potential for high-performance, privacy-conscious neural decoding. In contrast, closed-loop user studies required adapting FL methods to accommodate single-user, real-time interactions, a scenario not supported by standard FL. This modification resulted in local learning decoders surpassing the adapted FL approach in closed-loop performance, yet local learning still carried higher privacy risks. Our findings highlight a critical performance-privacy tradeoff in real-time adaptive applications and indicate the need for FL methods specifically designed for co-adaptive, single-user applications.
A Survey of Explainable Reinforcement Learning: Targets, Methods and Needs
The success of recent Artificial Intelligence (AI) models has been accompanied by the opacity of their internal mechanisms, due notably to the use of deep neural networks. In order to understand these internal mechanisms and explain the output of these AI models, a set of methods have been proposed, grouped under the domain of eXplainable AI (XAI). This paper focuses on a sub-domain of XAI, called eXplainable Reinforcement Learning (XRL), which aims to explain the actions of an agent that has learned by reinforcement learning. We propose an intuitive taxonomy based on two questions "What" and "How". The first question focuses on the target that the method explains, while the second relates to the way the explanation is provided. We use this taxonomy to provide a state-of-the-art review of over 250 papers. In addition, we present a set of domains close to XRL, which we believe should get attention from the community. Finally, we identify some needs for the field of XRL.
Online Adaptation of Terrain-Aware Dynamics for Planning in Unstructured Environments
Ward, William, Etter, Sarah, Ingebrand, Tyler, Ellis, Christian, Thorpe, Adam J., Topcu, Ufuk
--Autonomous mobile robots operating in remote, unstructured environments must adapt to new, unpredictable terrains that can change rapidly during operation. In such scenarios, a critical challenge becomes estimating the robot's dynamics on changing terrain in order to enable reliable, accurate navigation and planning. We present a novel online adaptation approach for terrain-aware dynamics modeling and planning using function encoders. By learning a set of neural network basis functions that span the robot dynamics on diverse terrains, we enable rapid online adaptation to new, unseen terrains and environments as a simple least-squares calculation. We demonstrate our approach for terrain adaptation in a Unity-based robotics simulator and show that the downstream controller has better empirical performance due to higher accuracy of the learned model. This leads to fewer collisions with obstacles while navigating in cluttered environments as compared to a neural ODE baseline. Rapid adaptation to unknown environments and terrain is critical for autonomous mobile robots. In off-road navigation, unpredictable terrain features such as rocky paths, forest floors, and wet fields can cause skidding, tripping, or immobilization, jeopardizing the robot's ability to reach its objective. Autonomous ground vehicles must therefore dynamically adjust their behavior to terrain-specific conditions. This adaptation is challenging because terrain variations directly alter system dynamics. For example, tire response to acceleration depends on surface friction.
GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM
Ahn, Kyeongjin, Han, Sungwon, Lee, Seungeon, Ahn, Donghyun, Kim, Hyoshin, Kim, Jungwon, Kim, Jihee, Park, Sangyoon, Cha, Meeyoung
Socio-economic indicators like regional GDP, population, and education levels, are crucial to shaping policy decisions and fostering sustainable development. This research introduces GeoReg a regression model that integrates diverse data sources, including satellite imagery and web-based geospatial information, to estimate these indicators even for data-scarce regions such as developing countries. Our approach leverages the prior knowledge of large language model (LLM) to address the scarcity of labeled data, with the LLM functioning as a data engineer by extracting informative features to enable effective estimation in few-shot settings. Specifically, our model obtains contextual relationships between data features and the target indicator, categorizing their correlations as positive, negative, mixed, or irrelevant. These features are then fed into the linear estimator with tailored weight constraints for each category. To capture nonlinear patterns, the model also identifies meaningful feature interactions and integrates them, along with nonlinear transformations. Experiments across three countries at different stages of development demonstrate that our model outperforms baselines in estimating socio-economic indicators, even for low-income countries with limited data availability.
The carbon cost of materials discovery: Can machine learning really accelerate the discovery of new photovoltaics?
Walker, Matthew, Butler, Keith T.
Computational screening has become a powerful complement to experimental efforts in the discovery of high-performance photovoltaic (PV) materials. Most workflows rely on density functional theory (DFT) to estimate electronic and optical properties relevant to solar energy conversion. Although more efficient than laboratory-based methods, DFT calculations still entail substantial computational and environmental costs. Machine learning (ML) models have recently gained attention as surrogates for DFT, offering drastic reductions in resource use with competitive predictive performance. In this study, we reproduce a canonical DFT-based workflow to estimate the maximum efficiency limit and progressively replace its components with ML surrogates. By quantifying the CO$_2$ emissions associated with each computational strategy, we evaluate the trade-offs between predictive efficacy and environmental cost. Our results reveal multiple hybrid ML/DFT strategies that optimize different points along the accuracy--emissions front. We find that direct prediction of scalar quantities, such as maximum efficiency, is significantly more tractable than using predicted absorption spectra as an intermediate step. Interestingly, ML models trained on DFT data can outperform DFT workflows using alternative exchange--correlation functionals in screening applications, highlighting the consistency and utility of data-driven approaches. We also assess strategies to improve ML-driven screening through expanded datasets and improved model architectures tailored to PV-relevant features. This work provides a quantitative framework for building low-emission, high-throughput discovery pipelines.
The Power of Architecture: Deep Dive into Transformer Architectures for Long-Term Time Series Forecasting
Shen, Lefei, Chen, Mouxiang, Fu, Han, Ren, Xiaoxue, Wang, Xiaoyun Joy, Sun, Jianling, Li, Zhuo, Liu, Chenghao
Transformer-based models have recently become dominant in Long-term Time Series Forecasting (LTSF), yet the variations in their architecture, such as encoder-only, encoder-decoder, and decoder-only designs, raise a crucial question: What Transformer architecture works best for LTSF tasks? However, existing models are often tightly coupled with various time-series-specific designs, making it difficult to isolate the impact of the architecture itself. To address this, we propose a novel taxonomy that disentangles these designs, enabling clearer and more unified comparisons of Transformer architectures. Our taxonomy considers key aspects such as attention mechanisms, forecasting aggregations, forecasting paradigms, and normalization layers. Through extensive experiments, we uncover several key insights: bi-directional attention with joint-attention is most effective; more complete forecasting aggregation improves performance; and the direct-mapping paradigm outperforms autoregressive approaches. Furthermore, our combined model, utilizing optimal architectural choices, consistently outperforms several existing models, reinforcing the validity of our conclusions. We hope these findings offer valuable guidance for future research on Transformer architectural designs in LTSF. Our code is available at https://github.com/HALF111/TSF_architecture.
CubeSat Orbit Insertion Maneuvering Using J2 Perturbation
Alandihallaj, M. Amin, Emami, M. Reza
The precise insertion of CubeSats into designated orbits is a complex task, primarily due to the limited propulsion capabilities and constrained fuel reserves onboard, which severely restrict the scope for large orbital corrections. This limitation necessitates the development of more efficient maneuvering techniques to ensure mission success. In this paper, we propose a maneuvering sequence that exploits the natural J2 perturbation caused by the Earth's oblateness. By utilizing the secular effects of this perturbation, it is possible to passively influence key orbital parameters such as the argument of perigee and the right ascension of the ascending node, thereby reducing the need for extensive propulsion-based corrections. The approach is designed to optimize the CubeSat's orbital insertion and minimize the total fuel required for trajectory adjustments, making it particularly suitable for fuel-constrained missions. The proposed methodology is validated through comprehensive numerical simulations that examine different initial orbital conditions and perturbation environments. Case studies are presented to demonstrate the effectiveness of the J2-augmented strategy in achieving accurate orbital insertion, showing a major reduction in fuel consumption compared to traditional methods. The results underscore the potential of this approach to extend the operational life and capabilities of CubeSats, offering a viable solution for future low-Earth orbit missions.
Advancing Seasonal Prediction of Tropical Cyclone Activity with a Hybrid AI-Physics Climate Model
Zhang, Gan, Rao, Megha, Yuval, Janni, Zhao, Ming
Machine learning (ML) models are successful with weather forecasting and have shown progress in climate simulations, yet leveraging them for useful climate predictions needs exploration. Here we show this feasibility using Neural General Circulation Model (NeuralGCM), a hybrid ML-physics atmospheric model developed by Google, for seasonal predictions of large-scale atmospheric variability and Northern Hemisphere tropical cyclone (TC) activity. Inspired by physical model studies, we simplify boundary conditions, assuming sea surface temperature (SST) and sea ice follow their climatological cycle but persist anomalies present at the initialization time. With such forcings, NeuralGCM can generate 100 simulation days in ~8 minutes with a single Graphics Processing Unit (GPU), while simulating realistic atmospheric circulation and TC climatology patterns. This configuration yields useful seasonal predictions (July to November) for the tropical atmosphere and various TC activity metrics. Notably, the predicted and observed TC frequency in the North Atlantic and East Pacific basins are significantly correlated during 1990 to 2023 (r=~0.7), suggesting prediction skill comparable to existing physical GCMs. Despite challenges associated with model resolution and simplified boundary forcings, the model-predicted interannual variations demonstrate significant correlations with the observation, including the sub-basin TC tracks (p<0.1) and basin-wide accumulated cyclone energy (p<0.01) of the North Atlantic and North Pacific basins. These findings highlight the promise of leveraging ML models with physical insights to model TC risks and deliver seamless weather-climate predictions.
An ultra-low-power CGRA for accelerating Transformers at the edge
Transformers have revolutionized deep learning with applications in natural language processing, computer vision, and beyond. However, their computational demands make it challenging to deploy them on low-power edge devices. This paper introduces an ultra-low-power, Coarse-Grained Reconfigurable Array (CGRA) architecture specifically designed to accelerate General Matrix Multiplication (GEMM) operations in transformer models tailored for the energy and resource constraints of edge applications. The proposed architecture integrates a 4 x 4 array of Processing Elements (PEs) for efficient parallel computation and dedicated 4 x 2 Memory Operation Blocks (MOBs) for optimized LOAD/STORE operations, reducing memory bandwidth demands and enhancing data reuse. A switchless mesh torus interconnect network further minimizes power and latency by enabling direct communication between PEs and MOBs, eliminating the need for centralized switching. Through its heterogeneous array design and efficient dataflow, this CGRA architecture addresses the unique computational needs of transformers, offering a scalable pathway to deploy sophisticated machine learning models on edge devices.
FLDmamba: Integrating Fourier and Laplace Transform Decomposition with Mamba for Enhanced Time Series Prediction
Zhang, Qianru, Yu, Chenglei, Wang, Haixin, Yan, Yudong, Cao, Yuansheng, Yiu, Siu-Ming, Wu, Tailin, Yin, Hongzhi
-- Time series prediction, a crucial task across various domains, faces significant challenges due to the inherent complexities of time series data, including non-stationarity, multi-scale periodicity, and transient dynamics, particularly when tackling long-term predictions. While Transformer-based architectures have shown promise, their quadratic complexity with sequence length hinders their efficiency for long-term predictions. Meanwhile, they are susceptible to data noise issues in time series. This paper proposes a novel framework, FLDmamba (Fourier and Laplace Transform Decomposition Mamba), addressing these limitations. FLDmamba leverages the strengths of both Fourier and Laplace transforms to effectively capture both multi-scale periodicity, transient dynamics within time series data, and improve the robustness of the model to the data noise issue. Our extensive experiments demonstrate that FLDmamba achieves superior performance on time series prediction benchmarks, outperforming both Transformer-based and other Mamba-based architectures. IME series prediction, which forecasts the future values of a (multivariate) variable based on its historical values, finds its application across a wide range of fields. Examples include weather prediction [1, 2], power grid management [3], traffic prediction [4, 5, 6, 7, 8, 9, 10], and stock market [11, 12, 13, 14], to name just a few. Despite significant advancements in this domain, the inherent complexities of time series data, such as non-stationarity, multi-scale periodicity, intrinsic stochasticity, and noise, pose substantial challenges to existing predictive models in long-term prediction. Q. Zhang and S.M. Yiu are from the University of Hong Kong.