Energy
Joint AoI and Handover Optimization in Space-Air-Ground Integrated Network
Lang, Zifan, Liu, Guixia, Sun, Geng, Li, Jiahui, Wang, Jiacheng, Yuan, Weijie, Niyato, Dusit, Kim, Dong In
Despite the widespread deployment of terrestrial networks, providing reliable communication services to remote areas and maintaining connectivity during emergencies remains challenging. Low Earth orbit (LEO) satellite constellations offer promising solutions with their global coverage capabilities and reduced latency, yet struggle with intermittent coverage and limited communication windows due to orbital dynamics. This paper introduces an age of information (AoI)-aware space-air-ground integrated network (SAGIN) architecture that leverages a high-altitude platform (HAP) as intelligent relay between the LEO satellites and ground terminals. Our three-layer design employs hybrid free-space optical (FSO) links for high-capacity satellite-to-HAP communication and reliable radio frequency (RF) links for HAP-to-ground transmission, and thus addressing the temporal discontinuity in LEO satellite coverage while serving diverse user priorities. Specifically, we formulate a joint optimization problem to simultaneously minimize the AoI and satellite handover frequency through optimal transmit power distribution and satellite selection decisions. This highly dynamic, non-convex problem with time-coupled constraints presents significant computational challenges for traditional approaches. To address these difficulties, we propose a novel diffusion model (DM)-enhanced dueling double deep Q-network with action decomposition and state transformer encoder (DD3QN-AS) algorithm that incorporates transformer-based temporal feature extraction and employs a DM-based latent prompt generative module to refine state-action representations through conditional denoising. Simulation results highlight the superior performance of the proposed approach compared with policy-based methods and some other deep reinforcement learning (DRL) benchmarks.
MFAF: An EVA02-Based Multi-scale Frequency Attention Fusion Method for Cross-View Geo-Localization
Liu, YiTong, Liu, TianZhu, GU, YanFeng
Cross-view geo-localization aims to determine the geographical location of a query image by matching it against a gallery of images. This task is challenging due to the significant appearance variations of objects observed from variable views, along with the difficulty in extracting discriminative features. Existing approaches often rely on extracting features through feature map segmentation while neglecting spatial and semantic information. To address these issues, we propose the EVA02-based Multi-scale Frequency Attention Fusion (MFAF) method. The MFAF method consists of Multi-Frequency Branch-wise Block (MFB) and the Frequency-aware Spatial Attention (FSA) module. The MFB block effectively captures both low-frequency structural features and high-frequency edge details across multiple scales, improving the consistency and robustness of feature representations across various viewpoints. Meanwhile, the FSA module adaptively focuses on the key regions of frequency features, significantly mitigating the interference caused by background noise and viewpoint variability. Extensive experiments on widely recognized benchmarks, including University-1652, SUES-200, and Dense-UAV, demonstrate that the MFAF method achieves competitive performance in both drone localization and drone navigation tasks.
PerchMobi^3: A Multi-Modal Robot with Power-Reuse Quad-Fan Mechanism for Air-Ground-Wall Locomotion
Chen, Yikai, Zheng, Zhi, Wang, Jin, He, Bingye, Xu, Xiangyu, Zhang, Jialu, Yu, Huan, Lu, Guodong
Achieving seamless integration of aerial flight, ground driving, and wall climbing within a single robotic platform remains a major challenge, as existing designs often rely on additional adhesion actuators that increase complexity, reduce efficiency, and compromise reliability. To address these limitations, we present PerchMobi^3, a quad-fan, negative-pressure, air-ground-wall robot that implements a propulsion-adhesion power-reuse mechanism. By repurposing four ducted fans to simultaneously provide aerial thrust and negative-pressure adhesion, and integrating them with four actively driven wheels, PerchMobi^3 eliminates dedicated pumps while maintaining a lightweight and compact design. To the best of our knowledge, this is the first quad-fan prototype to demonstrate functional power reuse for multi-modal locomotion. A modeling and control framework enables coordinated operation across ground, wall, and aerial domains with fan-assisted transitions. The feasibility of the design is validated through a comprehensive set of experiments covering ground driving, payload-assisted wall climbing, aerial flight, and cross-mode transitions, demonstrating robust adaptability across locomotion scenarios. These results highlight the potential of PerchMobi^3 as a novel design paradigm for multi-modal robotic mobility, paving the way for future extensions toward autonomous and application-oriented deployment.
Zero to Autonomy in Real-Time: Online Adaptation of Dynamics in Unstructured Environments
Ward, William, Etter, Sarah, Quattrociocchi, Jesse, Ellis, Christian, Thorpe, Adam J., Topcu, Ufuk
Abstract--Autonomous robots must go from zero prior knowledge to safe control within seconds to operate in unstructured environments. Abrupt terrain changes, such as a sudden transition to ice, create dynamics shifts that can destabilize planners unless the model adapts in real-time. We present a method for online adaptation that combines function encoders with recursive least squares, treating the function encoder coefficients as latent states updated from streaming odometry. We evaluate our approach on a V an der Pol system to highlight algorithmic behavior, in a Unity simulator for high-fidelity off-road navigation, and on a Clearpath Jackal robot, including on a challenging terrain at a local ice rink. Across these settings, our method improves model accuracy and downstream planning, reducing collisions compared to static and meta-learning baselines. High-speed ground vehicles require dynamics models that evolve as quickly as the terrain itself. When operating near the limits of controllability, even modest prediction errors in ground terrain interaction can lead to instability, skidding, or rollover. This problem is particularly apparent in off-road navigation: transitions such as pavement to loose gravel can change friction properties within seconds, while mixed terrain features introduce variation in the terrain properties that are difficult to accurately predict. Planning frameworks such as Model Predictive Path Integral Control (MPPI) [27] rely on an accurate model of the system dynamics to predict rollouts and select optimal control actions in real-time.
Finite-Agent Stochastic Differential Games on Large Graphs: II. Graph-Based Architectures
Hu, Ruimeng, Long, Jihao, Zhou, Haosheng
We propose a novel neural network architecture, called Non-Trainable Modification (NTM), for computing Nash equilibria in stochastic differential games (SDGs) on graphs. These games model a broad class of graph-structured multi-agent systems arising in finance, robotics, energy, and social dynamics, where agents interact locally under uncertainty. The NTM architecture imposes a graph-guided sparsification on feedforward neural networks, embedding fixed, non-trainable components aligned with the underlying graph topology. This design enhances interpretability and stability, while significantly reducing the number of trainable parameters in large-scale, sparse settings. We theoretically establish a universal approximation property for NTM in static games on graphs and numerically validate its expressivity and robustness through supervised learning tasks. Building on this foundation, we incorporate NTM into two state-of-the-art game solvers, Direct Parameterization and Deep BSDE, yielding their sparse variants (NTM-DP and NTM-DBSDE). Numerical experiments on three SDGs across various graph structures demonstrate that NTM-based methods achieve performance comparable to their fully trainable counterparts, while offering improved computational efficiency.
Explainable Unsupervised Multi-Anomaly Detection and Temporal Localization in Nuclear Times Series Data with a Dual Attention-Based Autoencoder
Vasili, Konstantinos, Dahm, Zachery T., Chatzidakis, Stylianos
The nuclear industry is advancing toward more new reactor designs, with next-generation reactors expected to be smaller in scale and power output. These systems have the potential to produce large volumes of information in the form of multivariate time-series data, which could be used for enhanced real-time monitoring and control. In this context, the development of remote autonomous or semi-autonomous control systems for reactor operation has gained significant interest. A critical first step toward such systems is an accurate diagnostics module capable of detecting and localizing anomalies within the reactor system. Recent studies have proposed various ML and DL approaches for anomaly detection in the nuclear domain. Despite promising results, key challenges remain, including limited to no explainability, lack of access to real-world data, and scarcity of abnormal events, which impedes benchmarking and characterization. Most existing studies treat these methods as black boxes, while recent work highlights the need for greater interpretability of ML/DL outputs in safety-critical domains. Here, we propose an unsupervised methodology based on an LSTM autoencoder with a dual attention mechanism for characterization of abnormal events in a real-world reactor radiation area monitoring system. The framework includes not only detection but also localization of the event and was evaluated using real-world datasets of increasing complexity from the PUR-1 research reactor. The attention mechanisms operate in both the feature and temporal dimensions, where the feature attention assigns weights to radiation sensors exhibiting abnormal patterns, while time attention highlights the specific timesteps where irregularities occur, thus enabling localization. By combining the results, the framework can identify both the affected sensors and the duration of each anomaly within a single unified network.
Meta-model Neural Process for Probabilistic Power Flow under Varying N-1 System Topologies
Ly, Sel, Chauhan, Kapil, Singh, Anshuman, Nguyen, Hung Dinh
The probabilistic power flow (PPF) problem is essential to quantifying the distribution of the nodal voltages due to uncertain injections. The conventional PPF problem considers a fixed topology, and the solutions to such a PPF problem are associated with this topology. A change in the topology might alter the power flow patterns and thus require the PPF problem to be solved again. The previous PPF model and its solutions are no longer valid for the new topology. This practice incurs both inconvenience and computation burdens as more contingencies are foreseen due to high renewables and a large share of electric vehicles. This paper presents a novel topology-adaptive approach, based on the meta-model Neural Process (MMNP), for finding the solutions to PPF problems under varying N-1 topologies, particularly with one-line failures. By leveraging context set-based topology representation and conditional distribution over function learning techniques, the proposed MMNP enhances the robustness of PPF models to topology variations, mitigating the need for retraining PPF models on a new configuration. Simulations on an IEEE 9-bus system and IEEE 118-bus system validate the model's performance. The maximum %L1-relative error norm was observed as 1.11% and 0.77% in 9-bus and 118-bus, respectively. This adaptive approach fills a critical gap in PPF methodology in an era of increasing grid volatility.
Developing an aeroponic smart experimental greenhouse for controlling irrigation and plant disease detection using deep learning and IoT
Narimani, Mohammadreza, Hajiahmad, Ali, Moghimi, Ali, Alimardani, Reza, Rafiee, Shahin, Mirzabe, Amir Hossein
Controlling environmental conditions and monitoring plant status in greenhouses is critical to promptly making appropriate management decisions aimed at promoting crop production. The primary objective of this research study was to develop and test a smart aeroponic greenhouse on an experimental scale where the status of Geranium plant and environmental conditions are continuously monitored through the integration of the internet of things (IoT) and artificial intelligence (AI). An IoT-based platform was developed to control the environmental conditions of plants more efficiently and provide insights to users to make informed management decisions. In addition, we developed an AI-based disease detection framework using VGG-19, InceptionResNetV2, and InceptionV3 algorithms to analyze the images captured periodically after an intentional inoculation. The performance of the AI framework was compared with an expert's evaluation of disease status. Preliminary results showed that the IoT system implemented in the greenhouse environment is able to publish data such as temperature, humidity, water flow, and volume of charge tanks online continuously to users and adjust the controlled parameters to provide an optimal growth environment for the plants. Furthermore, the results of the AI framework demonstrate that the VGG-19 algorithm was able to identify drought stress and rust leaves from healthy leaves with the highest accuracy, 92% among the other algorithms.
LLMAP: LLM-Assisted Multi-Objective Route Planning with User Preferences
Yuan, Liangqi, Han, Dong-Jun, Brinton, Christopher G., Brunswicker, Sabine
The rise of large language models (LLMs) has made natural language-driven route planning an emerging research area that encompasses rich user objectives. Current research exhibits two distinct approaches: direct route planning using LLM-as-Agent and graph-based searching strategies. However, LLMs in the former approach struggle to handle extensive map data, while the latter shows limited capability in understanding natural language preferences. Additionally, a more critical challenge arises from the highly heterogeneous and unpredictable spatio-temporal distribution of users across the globe. In this paper, we introduce a novel LLM-Assisted route Planning (LLMAP) system that employs an LLM-as-Parser to comprehend natural language, identify tasks, and extract user preferences and recognize task dependencies, coupled with a Multi-Step Graph construction with iterative Search (MSGS) algorithm as the underlying solver for optimal route finding. Our multi-objective optimization approach adaptively tunes objective weights to maximize points of interest (POI) quality and task completion rate while minimizing route distance, subject to three key constraints: user time limits, POI opening hours, and task dependencies. We conduct extensive experiments using 1,000 routing prompts sampled with varying complexity across 14 countries and 27 cities worldwide. The results demonstrate that our approach achieves superior performance with guarantees across multiple constraints.
Research on Short-Video Platform User Decision-Making via Multimodal Temporal Modeling and Reinforcement Learning
Wang, Jinmeiyang, Dong, Jing, Zhou, Li
This paper proposes the MT-DQN model, which integrates a Transformer, Temporal Graph Neural Network (TGNN), and Deep Q-Network (DQN) to address the challenges of predicting user behavior and optimizing recommendation strategies in short-video environments. Experiments demonstrated that MT-DQN consistently outperforms traditional concatenated models, such as Concat-Modal, achieving an average F1-score improvement of 10.97% and an average NDCG@5 improvement of 8.3%. Compared to the classic reinforcement learning model Vanilla-DQN, MT-DQN reduces MSE by 34.8% and MAE by 26.5%. Nonetheless, we also recognize challenges in deploying MT-DQN in real-world scenarios, such as its computational cost and latency sensitivity during online inference, which will be addressed through future architectural optimization.