Energy
Evaluating the Efficacy of LLM-Based Reasoning for Multiobjective HPC Job Scheduling
Jadhav, Prachi, Jin, Hongwei, Deelman, Ewa, Balaprakash, Prasanna
High-Performance Computing (HPC) job scheduling involves balancing conflicting objectives such as minimizing makespan, reducing wait times, optimizing resource use, and ensuring fairness. Traditional methods, whether heuristics such as First-Come-First-Served (FCFS) and Shortest Job First (SJF) or intensive optimization techniques, often lack adaptability to dynamic workloads and, more importantly, cannot simultaneously optimize multiple objectives in HPC systems. To address this, we propose a novel Large Language Model (LLM)-based scheduler using a ReAct-style framework (Reason + Act), enabling iterative, interpretable decision-making. The system incorporates a scratchpad memory to track scheduling history and refine decisions via natural language feedback, while a constraint enforcement module ensures feasibility and safety. We evaluate our approach using OpenAI's O4-Mini and Anthropic's Claude 3.7 across seven real-world HPC workload scenarios, including heterogeneous mixes, bursty patterns, and adversarial cases. Comparisons against FCFS, SJF, and Google OR-Tools (on 10 to 100 jobs) reveal that LLM-based scheduling effectively balances multiple objectives while offering transparent reasoning through natural language traces. The method excels in constraint satisfaction and adapts to diverse workloads without domain-specific training. However, a trade-off between reasoning quality and computational overhead challenges real-time deployment. This work presents the first comprehensive study of reasoning-capable LLMs for HPC scheduling, demonstrating their potential to handle multiobjective optimization while highlighting limitations in computational efficiency. The findings provide insights into leveraging advanced language models for complex scheduling problems in dynamic HPC environments.
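To make the two baseline heuristics named in the abstract concrete, the following is a minimal sketch of non-preemptive FCFS and SJF on a single resource. It is an illustration only, not the paper's evaluation harness: the `Job` fields, the `simulate` function, and the three example jobs are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    arrival: float  # submission time (hypothetical units)
    runtime: float  # execution time on the single resource


def simulate(jobs, policy):
    """Run jobs one at a time, non-preemptively; policy is 'fcfs' or 'sjf'."""
    if policy not in ("fcfs", "sjf"):
        raise ValueError(f"unknown policy: {policy}")
    pending = sorted(jobs, key=lambda j: j.arrival)
    clock, waits, order = 0.0, [], []
    while pending:
        ready = [j for j in pending if j.arrival <= clock]
        if not ready:
            clock = pending[0].arrival  # idle until the next submission
            continue
        if policy == "fcfs":
            job = min(ready, key=lambda j: j.arrival)  # earliest submitted
        else:
            job = min(ready, key=lambda j: j.runtime)  # shortest among waiting
        waits.append(clock - job.arrival)
        clock += job.runtime
        order.append(job.name)
        pending.remove(job)
    return {
        "order": order,
        "makespan": clock,
        "mean_wait": sum(waits) / len(waits),
    }


# Hypothetical workload: one long job followed by two short ones.
jobs = [Job("A", 0, 8), Job("B", 1, 2), Job("C", 2, 1)]
fcfs = simulate(jobs, "fcfs")
sjf = simulate(jobs, "sjf")
print(fcfs)
print(sjf)
```

On this toy workload both policies finish at the same time, but SJF lowers the mean wait by running the shortest waiting job first. Each heuristic optimizes a single criterion, which is the limitation the abstract points to.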
Optimization of Module Transferability in Single Image Super-Resolution: Universality Assessment and Cycle Residual Blocks
Cheng, Haotong, Zhang, Zhiqi, Li, Hao, Zhang, Xinshang
Deep learning has substantially advanced the field of Single Image Super-Resolution (SISR). However, existing research has predominantly focused on raw performance gains, with little attention paid to quantifying the transferability of architectural components. In this paper, we introduce the concept of "Universality" and its associated definitions, which extend the traditional notion of "Generalization" to encompass the ease of transferability of modules. We then propose the Universality Assessment Equation (UAE), a metric that quantifies how readily a given module can be transplanted across models and reveals the combined influence of multiple existing metrics on transferability. Guided by the UAE results of standard residual blocks and other plug-and-play modules, we further design two optimized modules: the Cycle Residual Block (CRB) and the Depth-Wise Cycle Residual Block (DCRB). Through comprehensive experiments on natural-scene benchmarks, remote-sensing datasets, and other low-level tasks, we demonstrate that networks embedded with the proposed plug-and-play modules outperform several state-of-the-art methods, achieving a PSNR improvement of up to 0.83 dB or enabling a 71.3% reduction in parameters with negligible loss in reconstruction fidelity. Similar optimization approaches could be applied to a broader range of basic modules, offering a new paradigm for the design of plug-and-play modules.
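For readers unfamiliar with the scale of the reported PSNR figure, the sketch below uses the standard PSNR definition, 10 log10(MAX^2 / MSE), to translate a gain in dB into a relative change in mean squared error. The function names and the 8-bit peak value of 255 are assumptions made for illustration; the abstract does not state the evaluation protocol.

```python
import math


def psnr(mse, max_val=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse <= 0:
        raise ValueError("mse must be positive")
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


ratio = mse_ratio_for_gain(0.83)
print(f"MSE after a 0.83 dB gain: {ratio:.3f} of the original")
```

Under this definition, a 0.83 dB improvement corresponds to roughly a 17% reduction in mean squared error, independent of the peak value.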
Revealing the empirical flexibility of gas units through deep clustering
Bassini, Chiara Fusar, Xu, Alice Lixuan, Canales, Jorge Sánchez, Hirth, Lion, Kaack, Lynn H.
The flexibility of a power generation unit determines how quickly and often it can ramp up or down. In energy models, it depends on assumptions on the technical characteristics of the unit, such as its installed capacity or turbine technology. In this paper, we learn the empirical flexibility of gas units from their electricity generation, revealing how real-world limitations can lead to substantial differences between units with similar technical characteristics. Using a novel deep clustering approach, we transform 5 years (2019-2023) of unit-level hourly generation data for 49 German units from 100 MWp of installed capacity into low-dimensional embeddings. Our unsupervised approach identifies two clusters of peaker units (high flexibility) and two clusters of non-peaker units (low flexibility). The estimated ramp rates of non-peakers, which constitute half of the sample, display a low empirical flexibility, comparable to coal units. Non-peakers, predominantly owned by industry and municipal utilities, show limited response to low residual load and negative prices, generating on average 1.3 GWh during those hours. As the transition to renewables increases market variability, regulatory changes will be needed to unlock this flexibility potential.
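As a rough illustration of what an empirical ramp rate means for hourly data, the sketch below computes hour-to-hour changes in generation and normalizes them by installed capacity. This is a simple descriptive statistic, not the deep clustering method or the ramp-rate estimator of the paper; the function names and the example series are hypothetical.

```python
def hourly_ramps(generation_mw):
    """Hour-to-hour change in output (MW per hour) for an hourly series."""
    return [b - a for a, b in zip(generation_mw, generation_mw[1:])]


def ramp_summary(generation_mw, capacity_mw):
    """Largest up and down ramps and mean absolute ramp, per unit of capacity."""
    ramps = hourly_ramps(generation_mw)
    if not ramps:
        raise ValueError("need at least two hourly values")
    return {
        "max_up": max(max(ramps), 0.0) / capacity_mw,
        "max_down": max(-min(ramps), 0.0) / capacity_mw,
        "mean_abs": sum(abs(r) for r in ramps) / len(ramps) / capacity_mw,
    }


# Hypothetical 200 MW unit observed over five consecutive hours.
summary = ramp_summary([0.0, 100.0, 200.0, 150.0, 0.0], capacity_mw=200.0)
print(summary)
```

A unit that rarely changes output would show values near zero in all three fields, regardless of what its technical specification allows.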
Closed-Loop Neural Operator-Based Observer of Traffic Density
Harting, Alice, Johansson, Karl Henrik, Barreau, Matthieu
We consider the problem of traffic density estimation with sparse measurements from stationary roadside sensors. Our approach uses Fourier neural operators to learn macroscopic traffic flow dynamics from high-fidelity data. To close the loop, we couple the open-loop operator with a correction operator that combines the predicted density with sparse measurements from the sensors. Simulations with the SUMO software indicate that, compared to open-loop observers, the proposed closed-loop observer exhibits classical closed-loop properties such as robustness to noise and ultimate boundedness of the error. This shows the advantages of combining learned physics with real-time corrections, and opens avenues for accurate, efficient, and interpretable data-driven observers.
SAFE--MA--RRT: Multi-Agent Motion Planning with Data-Driven Safety Certificates
Esmaeili, Babak, Modares, Hamidreza
This paper proposes a fully data-driven motion-planning framework for homogeneous linear multi-agent systems that operate in shared, obstacle-filled workspaces without access to explicit system models. Each agent independently learns its closed-loop behavior from experimental data by solving convex semidefinite programs that generate locally invariant ellipsoids and corresponding state-feedback gains. These ellipsoids, centered along grid-based waypoints, certify the dynamic feasibility of short-range transitions and define safe regions of operation. A sampling-based planner constructs a tree of such waypoints, where transitions are allowed only when adjacent ellipsoids overlap, ensuring invariant-to-invariant transitions and continuous safety. All agents expand their trees simultaneously and are coordinated through a space-time reservation table that guarantees inter-agent safety by preventing simultaneous occupancy and head-on collisions. Each successful edge in the tree is equipped with its own local controller, enabling execution without re-solving optimization problems at runtime. The resulting trajectories are not only dynamically feasible but also provably safe with respect to both environmental constraints and inter-agent collisions. Simulation results demonstrate the effectiveness of the approach in synthesizing synchronized, safe trajectories for multiple agents under shared dynamics and constraints, using only data and convex optimization tools.
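The space-time reservation idea can be sketched as a small lookup structure that rejects a move when the target is already claimed for the next time step (simultaneous occupancy) or when another agent has reserved the opposite move in the same step (head-on collision). This is a generic sketch on discrete cells and integer time steps, not the authors' implementation; the class name, method names, and grid cells are assumptions, and the ellipsoid-overlap feasibility check described in the abstract is not modeled.

```python
class ReservationTable:
    """Space-time reservations over discrete cells and integer time steps."""

    def __init__(self):
        self._cells = {}  # (cell, t) -> agent occupying cell at time t
        self._moves = {}  # (src, dst, t) -> agent moving src -> dst in [t, t+1]

    def occupy(self, agent, cell, t):
        """Record that an agent sits at a cell at time t (e.g. its start)."""
        self._cells[(cell, t)] = agent

    def is_free(self, agent, src, dst, t):
        """True if the agent may move src -> dst during [t, t+1]."""
        occupant = self._cells.get((dst, t + 1))
        if occupant is not None and occupant != agent:
            return False  # simultaneous occupancy of dst
        swapper = self._moves.get((dst, src, t))
        if swapper is not None and swapper != agent:
            return False  # head-on collision along the same edge
        return True

    def reserve(self, agent, src, dst, t):
        """Reserve the move if it is conflict-free; return success."""
        if not self.is_free(agent, src, dst, t):
            return False
        self._cells[(dst, t + 1)] = agent
        self._moves[(src, dst, t)] = agent
        return True


table = ReservationTable()
table.occupy("A", (0, 0), 0)
table.occupy("B", (0, 1), 0)
ok_a = table.reserve("A", (0, 0), (0, 1), 0)     # A steps into B's current cell
head_on = table.reserve("B", (0, 1), (0, 0), 0)  # B swapping with A is rejected
ok_b = table.reserve("B", (0, 1), (1, 1), 0)     # B sidesteps instead
print(ok_a, head_on, ok_b)
```

In this sketch an agent may enter a cell that another agent vacates in the same step; whether that is permitted in the paper's setting is not stated in the abstract.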
Sailing Towards Zero-Shot State Estimation using Foundation Models Combined with a UKF
Holtmann, Tobin, Stenger, David, Posada-Moreno, Andres, Solowjow, Friedrich, Trimpe, Sebastian
State estimation in control and systems engineering traditionally requires extensive manual system identification or data-collection effort. However, transformer-based foundation models in other domains have reduced data requirements by leveraging pre-trained generalist models. Ultimately, developing zero-shot foundation models of system dynamics could drastically reduce manual deployment effort. While recent work shows that transformer-based end-to-end approaches can achieve zero-shot performance on unseen systems, they are limited to sensor models seen during training. We introduce the foundation model unscented Kalman filter (FM-UKF), which combines a transformer-based model of system dynamics with analytically known sensor models via a UKF, enabling generalization across varying dynamics without retraining for new sensor configurations. We evaluate FM-UKF on a new benchmark of container ship models with complex dynamics, demonstrating a competitive trade-off between accuracy, effort, and robustness compared to classical methods with approximate system knowledge and to an end-to-end approach. The benchmark and dataset are open-sourced to support future research in zero-shot state estimation via foundation models.
Multimodal Feature Fusion Network with Text Difference Enhancement for Remote Sensing Change Detection
Zhou, Yijun, Zhai, Yikui, Ying, Zilu, Xian, Tingfeng, Zhou, Wenlve, Zhou, Zhiheng, Tian, Xiaolin, Jia, Xudong, Zhang, Hongsheng, Chen, C. L. Philip
Although deep learning has advanced remote sensing change detection (RSCD), most methods rely solely on image modality, limiting feature representation, change pattern modeling, and generalization, especially under illumination and noise disturbances. To address this, we propose MMChange, a multimodal RSCD method that combines image and text modalities to enhance accuracy and robustness. An Image Feature Refinement (IFR) module is introduced to highlight key regions and suppress environmental noise. To overcome the semantic limitations of image features, we employ a vision-language model (VLM) to generate semantic descriptions of bi-temporal images. To bridge the heterogeneity between modalities, we design an Image-Text Feature Fusion (ITFF) module that enables deep cross-modal integration. Extensive experiments on LEVIR-CD, WHU-CD, and SYSU-CD demonstrate that MMChange consistently surpasses state-of-the-art methods across multiple metrics, validating its effectiveness for multimodal RSCD.
Yijun Zhou, Yikui Zhai, Zilu Ying and Tingfeng Xian are with the College of Electronics and Information Engineering, Wuyi University, Jiangmen, 529020, China (e-mail: 17346700814@163.com). Wenlve Zhou and Zhiheng Zhou are with the School of Electronic and Information Engineering and the Key Laboratory of Big Data and Intelligent Robot, Ministry of Education, South China University of Technology, Guangzhou, Guangdong 510641, China (e-mail: wenlvezhou@163.com). Xiaolin Tian is with the State Key Laboratory of Lunar and Planetary Sciences, Macau University of Science and Technology, Taipa, Macau (e-mail: xltian@must.edu.mo). Xudong Jia is with the College of Engineering and Computer Science, California State University, Northridge, 18111, America (e-mail: Xudong.Jia@csun.edu). Hongsheng Zhang is with the Department of Geography, The University of Hong Kong, Hong Kong, China (e-mail: zhanghs@hku.hk). C. L. Philip Chen is with the Faculty of Computer Science and Engineering, South China University of Technology, Guangzhou, Guangdong 510006, China (e-mail: philip.chen@ieee.org).
Mapping on a Budget: Optimizing Spatial Data Collection for ML
Betti, Livia, Sanni, Farooq, Sogoyou, Gnouyaro, Agbagla, Togbe, Molitor, Cullen, Carleton, Tamma, Rolf, Esther
In applications across agriculture, ecology, and human development, machine learning with satellite imagery (SatML) is limited by the sparsity of labeled training data. While satellite data cover the globe, labeled training datasets for SatML are often small, spatially clustered, and collected for other purposes (e.g., administrative surveys or field measurements). Despite the pervasiveness of this issue in practice, past SatML research has largely focused on new model architectures and training algorithms to handle scarce training data, rather than modeling data conditions directly. This leaves scientists and policymakers who wish to use SatML for large-scale monitoring uncertain about whether and how to collect additional data to maximize performance. Here, we present the first problem formulation for the optimization of spatial training data in the presence of heterogeneous data collection costs and realistic budget constraints, as well as novel methods for addressing this problem. In experiments simulating different problem settings across three continents and four tasks, our strategies reveal substantial gains from sample optimization. Further experiments delineate settings for which optimized sampling is particularly effective. The problem formulation and methods we introduce are designed to generalize across application domains for SatML; we put special emphasis on a specific problem setting where our coauthors can immediately use our findings to augment clustered agricultural surveys for SatML monitoring in Togo.
Ukraine knocks out Russian refineries as Russia kills dozens in Kyiv
Ukraine has pounded Russia's refineries with deep strikes in the past week, worsening its petrol shortages and causing Moscow to extend a ban on exports of petroleum products. Russia responded with a deadly attack on Kyiv and a barrage of statements portraying Moscow's "special military operation" as a success and Ukraine as teetering on the edge of surrender. Russian President Vladimir Putin used his appearance at the Shanghai Cooperation Organisation (SCO) summit to suggest Russian energy exports to China and India were booming, but reports suggested that Moscow is heavily discounting its crude to hold onto clients. Ukraine's European and regional allies are meeting on Thursday to try to finalise security guarantees in case a ceasefire should come about, while calling on US President Donald Trump to use sanctions to press Putin into direct negotiations with Kyiv. Russia redeployed marines and paratroopers, elite units, from Ukraine's northern Sumy region to the eastern region of Donetsk on September 1, suggesting it may be preparing a renewed push for the city of Pokrovsk, which Ukraine has identified as a key Russian tactical objective in the east since August 2024.
The Download: unnerving AI avatars, and Trump's climate gift to China
Earlier this summer, I visited the AI company Synthesia to give it what it needed to create a hyperrealistic AI-generated avatar of me. The company's avatars are a decent barometer of just how dizzying progress has been in AI over the past few years, so I was curious just how accurately its latest AI model, introduced last month, could replicate me. I found my avatar as unnerving as it is technically impressive. It's slick enough to pass as a high-definition recording of a chirpy corporate speech, and if you didn't know me, you'd probably think that's exactly what it was. My avatar shows how it's becoming ever-harder to distinguish the artificial from the real.