Energy
Choosing to Be Green: Advancing Green AI via Dynamic Model Selection
Cruciani, Emilio, Verdecchia, Roberto
Artificial Intelligence is increasingly pervasive across domains, with ever more complex models delivering impressive predictive performance. This rapid technological advancement, however, comes at a concerning environmental cost, with state-of-the-art models - particularly deep neural networks and large language models - requiring substantial computational resources and energy. In this work, we present the intuition of Green AI dynamic model selection, an approach based on dynamic model selection that aims at reducing the environmental footprint of AI by selecting the most sustainable model while minimizing potential accuracy loss. Specifically, our approach takes into account the inference task, the environmental sustainability of available models, and accuracy requirements to dynamically choose the most suitable model. Our approach presents two different methods, namely Green AI dynamic model cascading and Green AI dynamic model routing. We demonstrate the effectiveness of our approach via a proof-of-concept empirical example based on a real-world dataset. Our results show that Green AI dynamic model selection can achieve substantial energy savings (up to ~25%) while substantially retaining the accuracy of the most energy-greedy solution (up to ~95%). In conclusion, our preliminary findings highlight the potential that hybrid, adaptive model selection strategies hold to mitigate the energy demands of modern AI systems without significantly compromising accuracy requirements.
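The cascading idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the confidence threshold, the per-model energy costs, and the toy models `small_model` and `large_model` are all hypothetical. The sketch assumes each model returns a label with a confidence score, and that a cheaper model's answer is accepted whenever its confidence is high enough.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a model maps an input to (label, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def cascade_predict(
    x: str, models: List[Tuple[Model, float]], threshold: float
) -> Tuple[str, float]:
    """Query models from lowest to highest energy cost.

    Stops at the first prediction whose confidence reaches `threshold`;
    if none does, the answer of the most expensive model is returned.
    Returns (label, total energy spent across all queried models).
    """
    if not models:
        raise ValueError("at least one model is required")
    spent = 0.0
    label = ""
    for model, energy_cost in sorted(models, key=lambda pair: pair[1]):
        label, confidence = model(x)
        spent += energy_cost
        if confidence >= threshold:
            break
    return label, spent


# Toy stand-ins (assumptions for illustration only).
def small_model(x: str) -> Tuple[str, float]:
    # Pretend the cheap model is only confident on short inputs.
    return ("positive", 0.9 if len(x) < 20 else 0.4)


def large_model(x: str) -> Tuple[str, float]:
    return ("negative", 0.99)


MODELS = [(large_model, 10.0), (small_model, 1.0)]  # (model, energy cost)
```

With these toy models, a short input is answered by the cheap model alone, while a long input falls through to the expensive one and pays the energy cost of both. A routing variant would instead pick a single model up front, before running any of them.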
From Samples to Scenarios: A New Paradigm for Probabilistic Forecasting
Dai, Xilin, Xu, Zhijian, Cai, Wanxu, Xu, Qiang
Most state-of-the-art probabilistic time series forecasting models rely on sampling to represent future uncertainty. However, this paradigm suffers from inherent limitations, such as lacking explicit probabilities, inadequate coverage, and high computational costs. In this work, we introduce \textbf{Probabilistic Scenarios}, an alternative paradigm designed to address the limitations of sampling. It operates by directly producing a finite set of \{Scenario, Probability\} pairs, thus avoiding Monte Carlo-like approximation. To validate this paradigm, we propose \textbf{TimePrism}, a simple model composed of only three parallel linear layers. Surprisingly, TimePrism achieves 9 out of 10 state-of-the-art results across five benchmark datasets on two metrics. The effectiveness of our paradigm comes from a fundamental reframing of the learning objective. Instead of modeling an entire continuous probability space, the model learns to represent a set of plausible scenarios and corresponding probabilities. Our work demonstrates the potential of the Probabilistic Scenarios paradigm, opening a promising research direction in forecasting beyond sampling.
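To make the {Scenario, Probability} output format concrete, the following sketch shows one way such a finite set of pairs could be represented and summarized. It does not reproduce TimePrism: the use of a softmax to turn scores into probabilities, and the function names, are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Scenario = List[float]  # one candidate future trajectory


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scenario_pairs(
    scenarios: List[Scenario], logits: List[float]
) -> List[Tuple[Scenario, float]]:
    """Attach an explicit probability to each scenario (assumed: softmax of scores)."""
    if len(scenarios) != len(logits):
        raise ValueError("one score per scenario is required")
    return list(zip(scenarios, softmax(logits)))


def expected_forecast(pairs: List[Tuple[Scenario, float]]) -> List[float]:
    """Probability-weighted mean trajectory, computed without any sampling."""
    horizon = len(pairs[0][0])
    return [sum(p * s[t] for s, p in pairs) for t in range(horizon)]
```

Because the probabilities are explicit, summaries such as the expected trajectory or the most likely scenario follow directly from the pairs rather than from a Monte Carlo estimate.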
Formal Safety Verification and Refinement for Generative Motion Planners via Certified Local Stabilization
Nath, Devesh, Yin, Haoran, Chou, Glen
We present a method for formal safety verification of learning-based generative motion planners. Generative motion planners (GMPs) offer advantages over traditional planners, but verifying the safety and dynamic feasibility of their outputs is difficult since neural network verification (NNV) tools scale only to a few hundred neurons, while GMPs often contain millions. To preserve GMP expressiveness while enabling verification, our key insight is to imitate the GMP by stabilizing references sampled from the GMP with a small neural tracking controller and then applying NNV to the closed-loop dynamics. This yields reachable sets that rigorously certify closed-loop safety, while the controller enforces dynamic feasibility. Building on this, we construct a library of verified GMP references and deploy them online in a way that imitates the original GMP distribution whenever it is safe to do so, improving safety without retraining. We evaluate across diverse planners, including diffusion, flow matching, and vision-language models, improving safety in simulation (on ground robots and quadcopters) and on hardware (differential-drive robot). Motion planning has been transformed by generative models like diffusion and conditional flow matching (CFM) [1], [2], which learn multimodal trajectory distributions and enable generative motion planners (GMPs) that produce diverse plans from inputs like language or images [3]-[6].
Where 6G Stands Today: Evolution, Enablers, and Research Gaps
Tika, Salma, Haqiq, Abdelkrim, Sabir, Essaid, Driouch, Elmahdi
As the fifth-generation (5G) mobile communication system continues its global deployment, both industry and academia have started conceptualizing the sixth generation (6G) to address the growing needs of an increasingly advanced and digital society. Although 5G offers considerable advancements over LTE, it may struggle to meet all of the requirements, including ultra-high reliability, seamless automation, and ubiquitous coverage. In response, 6G is expected to deliver a highly intelligent, automated, and ultra-reliable communication system that can handle a vast number of connected devices. This paper offers a comprehensive overview of 6G, beginning with its main stringent requirements while focusing on key enabling technologies such as terahertz (THz) communications, intelligent reflecting surfaces, massive MIMO, and AI-driven networking that will shape 6G networks. Furthermore, the paper lists various 6G applications and usage scenarios that will benefit from these advancements. Finally, we outline the potential challenges that must be addressed to achieve the 6G promises. Keywords: 6G, Usage Scenarios, Capabilities, Enabling technologies, Challenges. I. INTRODUCTION The wireless industry has continuously evolved, and it is one of the few industry sectors that have kept a fast-growing trend, with each generation introducing higher frequencies, larger bandwidths, and faster data rates [1]. Since Marconi's wireless telegraphy in the 19th century, mobile networks have advanced from 1G's basic voice services to 5G's ultra-high-definition 3D data transmission. As 5G deployment expands throughout the world, researchers are now focusing on 6G, which is anticipated to be realized by 2030.
Are We Scaling the Right Thing? A System Perspective on Test-Time Scaling
Zhao, Youpeng, LV, Jinpeng, Wu, Di, Wang, Jun, Gooley, Christopher
Test-time scaling (TTS) has recently emerged as a promising direction to exploit the hidden reasoning capabilities of pre-trained large language models (LLMs). However, existing scaling methods narrowly focus on the compute-optimal Pareto-frontier, ignoring the simple fact that compute-optimal is not always system-optimal. In this work, we propose a system-driven perspective on TTS, analyzing how reasoning models scale against practical metrics, such as latency and cost-per-token. By evaluating the impact of popular optimizations such as tensor parallelism and speculative decoding, our preliminary analysis reveals the limitations of current methods and calls for a paradigm shift toward holistic, system-aware evaluations that capture the true essence of scaling laws at inference time.
Transformer Modeling for Both Scalability and Performance in Multivariate Time Series
Variable count is among the main scalability bottlenecks for transformer modeling in multivariate time series (MTS) data. On top of this, a growing consensus in the field points to indiscriminate inter-variable mixing as a potential source of noise accumulation and performance degradation. This is likely exacerbated by the sparsity of informative signals characteristic of many MTS systems, coupled with representational misalignment stemming from indiscriminate information mixing between (heterogeneous) variables. While scalability and performance are often seen as competing interests in transformer design, we show that both can be improved simultaneously in MTS by strategically constraining the representational capacity of inter-variable mixing. Our proposed method, transformer with Delegate Token Attention (DELTAformer), constrains inter-variable modeling through what we call delegate tokens, which are then used to perform full, unconstrained, inter-temporal modeling. Delegate tokens act as an implicit regularizer that forces the model to be highly selective about what inter-variable information is allowed to propagate through the network. Our results show that DELTAformer scales linearly with variable count while actually outperforming standard transformers, achieving state-of-the-art performance across benchmarks and baselines. In addition, DELTAformer can focus on relevant signals better than standard transformers in noisy MTS environments and exhibits superior noise resilience overall. Taken together, results across various experiments confirm that by aligning our model design to leverage domain-specific challenges in MTS to our advantage, DELTAformer can simultaneously achieve linear scaling while actually improving its performance against standard, quadratic transformers.
OmniFed: A Modular Framework for Configurable Federated Learning from Edge to HPC
Tyagi, Sahil, Cozma, Andrei, Kotevska, Olivera, Wang, Feiyi
Federated Learning (FL) is critical for edge and High Performance Computing (HPC) where data is not centralized and privacy is crucial. We present OmniFed, a modular framework designed around decoupling and clear separation of concerns for configuration, orchestration, communication, and training logic. Its architecture supports configuration-driven prototyping and code-level override-what-you-need customization. It also supports different topologies, mixed communication protocols within a single deployment, and popular training algorithms, and offers optional privacy mechanisms including Differential Privacy (DP), Homomorphic Encryption (HE), and Secure Aggregation (SA), as well as compression strategies. These capabilities are exposed through well-defined extension points, allowing users to customize topology and orchestration, learning logic, and privacy/compression plugins, all while preserving the integrity of the core system. We evaluate multiple models and algorithms to measure various performance metrics. By unifying topology configuration, mixed-protocol communication, and pluggable modules in one stack, OmniFed streamlines FL deployment across heterogeneous environments. The GitHub repository is available at https://github.com/at-aaims/OmniFed.
Short-Term Regional Electricity Demand Forecasting in Argentina Using LSTM Networks
This study presents the development and optimization of a deep learning model based on Long Short-Term Memory (LSTM) networks to predict short-term hourly electricity demand in Córdoba, Argentina. Integrating historical consumption data with exogenous variables (climatic factors, temporal cycles, and demographic statistics), the model achieved high predictive precision, with a mean absolute percentage error of 3.20\% and a determination coefficient of 0.95. The inclusion of periodic temporal encodings and weather variables proved crucial to capture seasonal patterns and extreme consumption events, enhancing the robustness and generalizability of the model. In addition to the design and hyperparameter optimization of the LSTM architecture, two complementary analyses were carried out: (i) an interpretability study using Random Forest regression to quantify the relative importance of exogenous drivers, and (ii) an evaluation of model performance in predicting the timing of daily demand maxima and minima, achieving exact-hour accuracy in more than two-thirds of the test days and within ±1 hour in over 90\% of cases. Together, these results highlight both the predictive accuracy and operational relevance of the proposed framework, providing valuable insights for grid operators seeking optimized planning and control strategies under diverse demand scenarios.
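Two quantities mentioned in the abstract above, periodic temporal encodings and the mean absolute percentage error, can be sketched in a few lines. The sine/cosine form of the encoding and the peak-timing check are common choices assumed here for illustration; the abstract does not specify the exact encoding or evaluation code the study used.

```python
import math
from typing import Sequence, Tuple


def encode_hour(hour: int) -> Tuple[float, float]:
    """Map an hour of day onto the unit circle so that 23:00 and 00:00 are neighbours."""
    angle = 2.0 * math.pi * (hour % 24) / 24.0
    return math.sin(angle), math.cos(angle)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def peak_hour_hit(
    actual_day: Sequence[float], predicted_day: Sequence[float], tolerance: int = 1
) -> bool:
    """True if the predicted daily maximum falls within `tolerance` hours of the actual one."""
    actual_peak = max(range(len(actual_day)), key=actual_day.__getitem__)
    predicted_peak = max(range(len(predicted_day)), key=predicted_day.__getitem__)
    return abs(actual_peak - predicted_peak) <= tolerance
```

The cyclic encoding avoids the artificial jump between hour 23 and hour 0 that a raw integer feature would introduce. The same pattern applies to day of week or month of year with a different period.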
STL-FFT-STFT-TCN-LSTM: An Effective Wave Height High Accuracy Prediction Model Fusing Time-Frequency Domain Features
Liu, Huipeng, Zhu, Zhichao, Zhou, Yuan, Li, Changlu
As the consumption of traditional energy sources intensifies and their adverse environmental impacts become more pronounced, wave energy stands out as a highly promising member of the renewable energy family due to its high energy density, stability, widespread distribution, and environmental friendliness. The key to its development lies in the precise prediction of Significant Wave Height (WVHT). However, wave energy signals exhibit strong nonlinearity, abrupt changes, multi-scale periodicity, data sparsity, and high-frequency noise interference; additionally, physical models for wave energy prediction incur extremely high computational costs. To address these challenges, this study proposes a hybrid STL-FFT-STFT-TCN-LSTM model. This model exploits the Seasonal-Trend Decomposition Procedure based on Loess (STL), Fast Fourier Transform (FFT), Short-Time Fourier Transform (STFT), Temporal Convolutional Network (TCN), and Long Short-Term Memory (LSTM) technologies. The model aims to optimize multi-scale feature fusion, capture extreme wave heights, and address issues related to high-frequency noise and periodic signals, thereby achieving efficient and accurate prediction of significant wave height. Experiments were conducted using hourly data from NOAA Stations 41008 and 41047 spanning 2019 to 2022. The results showed that, compared with other single models and hybrid models, the STL-FFT-STFT-TCN-LSTM model achieved significantly higher prediction accuracy in capturing extreme wave heights and suppressing high-frequency noise, with MAE reduced by 15.8\%-40.5\%, SMAPE reduced by 8.3\%-20.3\%, and R increased by 1.31\%-2.9\%; in ablation experiments, the model also demonstrated the indispensability of each component step, validating its superiority in multi-scale feature fusion.
Cluster Workload Allocation: A Predictive Approach Leveraging Machine Learning Efficiency
This research investigates how Machine Learning (ML) algorithms can assist in workload allocation strategies by detecting tasks with node affinity operators (referred to as constraint operators), which constrain their execution to a limited number of nodes. Using real-world Google Cluster Data (GCD) workload traces and the AGOCS framework, the study extracts node attributes and task constraints, then analyses them to identify suitable node-task pairings. It focuses on tasks that can be executed on either a single node or fewer than a thousand out of 12.5k nodes in the analysed GCD cluster. Task constraint operators are compacted, pre-processed with one-hot encoding, and used as features in a training dataset. Various ML classifiers, including Artificial Neural Networks, K-Nearest Neighbours, Decision Trees, Naive Bayes, Ridge Regression, Adaptive Boosting, and Bagging, are fine-tuned and assessed for accuracy and F1-scores. The final ensemble voting classifier model achieved 98% accuracy and a 1.5-1.8% misclassification rate for tasks with a single suitable node.
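The preprocessing and ensembling steps named in the abstract above, one-hot encoding of constraint operators and a voting classifier, can be sketched as follows. The attribute names, operator vocabulary, and the choice of hard (majority) voting are hypothetical; the abstract does not describe the study's actual feature layout or voting scheme.

```python
from collections import Counter
from typing import Dict, List, Sequence

# Hypothetical vocabulary of constraint operators per node attribute.
OPERATORS = ["none", "equal", "not_equal", "less_than", "greater_than"]


def one_hot(value: str, vocabulary: Sequence[str]) -> List[int]:
    """Encode a single categorical value as a vector with exactly one 1."""
    if value not in vocabulary:
        raise ValueError(f"unknown value: {value!r}")
    return [1 if value == item else 0 for item in vocabulary]


def encode_task(constraints: Dict[str, str], attributes: Sequence[str]) -> List[int]:
    """Concatenate one-hot vectors, one per attribute, in a fixed attribute order.

    Attributes without a constraint are encoded with the "none" operator.
    """
    features: List[int] = []
    for attribute in attributes:
        features.extend(one_hot(constraints.get(attribute, "none"), OPERATORS))
    return features


def majority_vote(predictions: Sequence[str]) -> str:
    """Hard voting: return the label predicted by the most classifiers."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    return Counter(predictions).most_common(1)[0][0]
```

In a full pipeline, the feature vectors from `encode_task` would be fed to each trained classifier, and `majority_vote` would combine their per-task predictions into the ensemble's output.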