AITopics

Climate policy studies require models that capture the combined effects of multiple greenhouse gases on global temperature, but these models are computationally expensive and difficult to embed in reinforcement learning. We present a multi-agent reinforcement learning (MARL) framework that integrates a high-fidelity, highly efficient climate surrogate directly in the environment loop, enabling regional agents to learn climate policies under multi-gas dynamics. As a proof of concept, we introduce a recurrent neural network architecture pretrained on $20{,}000$ multi-gas emission pathways to emulate the climate model CICERO-SCM. The surrogate model attains near-simulator accuracy, with a global-mean temperature RMSE of $\approx 0.0004\,\mathrm{K}$ and approximately $1000\times$ faster one-step inference. When substituted for the original simulator in a climate-policy MARL setting, it accelerates end-to-end training by $>\!100\times$. We show that the surrogate and simulator converge to the same optimal policies and propose a methodology to assess this property in cases where using the simulator is intractable. Our work makes it possible to bypass the core computational bottleneck without sacrificing policy fidelity, enabling large-scale multi-agent experiments across alternative climate-policy regimes with multi-gas dynamics and high-fidelity climate response.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2510.07971

Country:

Europe > Denmark > Capital Region > Kongens Lyngby (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Energy > Energy Policy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Profit Mirage: Revisiting Information Leakage in LLM-based Financial Agents

Li, Xiangyu, Zeng, Yawen, Xing, Xiaofen, Xu, Jin, Xu, Xiangmin

LLM-based financial agents have attracted widespread excitement for their ability to trade like human experts. However, most systems exhibit a "profit mirage": dazzling back-tested returns evaporate once the model's knowledge window ends, because of the inherent information leakage in LLMs. In this paper, we systematically quantify this leakage issue across four dimensions and release FinLake-Bench, a leakage-robust evaluation benchmark. Furthermore, to mitigate this issue, we introduce FactFin, a framework that applies counterfactual perturbations to compel LLM-based agents to learn causal drivers instead of memorized outcomes. FactFin integrates four core components: Strategy Code Generator, Retrieval-Augmented Generation, Monte Carlo Tree Search, and Counterfactual Simulator. Extensive experiments show that our method surpasses all baselines in out-of-sample generalization, delivering superior risk-adjusted performance.

large language model, machine learning, natural language, (17 more...)

2510.0792

Country:

North America (0.68)
Asia > China (0.47)

Genre: Research Report (0.65)

Industry:

Information Technology (1.00)
Banking & Finance > Trading (1.00)
Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Control Synthesis of Cyber-Physical Systems for Real-Time Specifications through Causation-Guided Reinforcement Learning

Tang, Xiaochen, Zhang, Zhenya, Zhang, Miaomiao, An, Jie

In real-time and safety-critical cyber-physical systems (CPSs), control synthesis must guarantee that generated policies meet stringent timing and correctness requirements under uncertain and dynamic conditions. Signal temporal logic (STL) has emerged as a powerful formalism for expressing real-time constraints, with its semantics enabling quantitative assessment of system behavior. Meanwhile, reinforcement learning (RL) has become an important method for solving control synthesis problems in unknown environments. Recent studies incorporate STL-based reward functions into RL to automatically synthesize control policies. However, the automatically inferred rewards obtained by these methods represent the global assessment of a whole or partial path but do not accurately accumulate the rewards of local changes, so the sparse global rewards may lead to non-convergence and unstable training performance. In this paper, we propose an online reward generation method guided by the online causation monitoring of STL. Our approach continuously monitors system behavior against an STL specification at each control step, computing the quantitative distance toward satisfaction or violation and thereby producing rewards that reflect instantaneous state dynamics. Additionally, we provide a smooth approximation of the causation semantics to overcome its discontinuity and make it differentiable for use with deep-RL methods. We have implemented a prototype tool and evaluated it in the Gym environment on a variety of continuously controlled benchmarks. Experimental results show that our proposed STL-guided RL method with online causation semantics outperforms existing relevant STL-guided RL methods, providing a more robust and efficient reward generation framework for deep-RL.
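As a rough illustration of the quantitative STL semantics mentioned above, the sketch below computes the standard robustness of a hypothetical specification "always x < threshold" over a signal prefix, together with a log-sum-exp soft minimum that makes the value differentiable. This is the classical robustness measure, not the causation semantics proposed in the paper; the function names, the specification, and the smoothing parameter `beta` are assumptions made for illustration only.

```python
import math

def robustness_always_below(signal, threshold):
    """Classical STL robustness of G(x < threshold): the worst-case margin."""
    return min(threshold - x for x in signal)

def smooth_min(values, beta=10.0):
    """Soft minimum via log-sum-exp; a differentiable lower bound on min(values).

    Shifting by the true minimum keeps the exponentials numerically stable.
    """
    m = min(values)
    return m - math.log(sum(math.exp(-beta * (v - m)) for v in values)) / beta

# Hypothetical signal and threshold, for illustration only.
signal = [0.2, 0.5, 0.9, 0.4]
threshold = 1.0
margins = [threshold - x for x in signal]

rho = robustness_always_below(signal, threshold)  # exact robustness
rho_soft = smooth_min(margins, beta=10.0)         # smooth approximation

# Robustness of each prefix, the kind of per-step quantity an online
# monitor could turn into a dense reward.
stepwise = [robustness_always_below(signal[: t + 1], threshold)
            for t in range(len(signal))]
```

The soft minimum never exceeds the exact minimum and lies within `log(n) / beta` of it for `n` values, so a larger `beta` gives a tighter but less smooth approximation.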

machine learning, real time system, specification, (18 more...)

2510.07715

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California > Los Angeles County (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology (0.67)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Architecture > Real Time Systems (1.00)

Qaim, Waleed Bin, Ometov, Aleksandr, Campolo, Claudia, Molinaro, Antonella, Lohan, Elena Simona, Nurmi, Jari

Reinforcement Learning-based Task Offloading in the Internet of Wearable Things

Over the years, significant contributions have been made by the research and industrial sectors to improve wearable devices towards the Internet of Wearable Things (IoWT) paradigm. However, wearables are still facing several challenges. Many stem from the limited battery power and insufficient computation resources available on wearable devices. On the other hand, with the popularity of smart wearables, there is a consistent increase in the development of new computationally intensive and latency-critical applications. In such a context, task offloading allows wearables to leverage the resources available on nearby edge devices to enhance the overall user experience. This paper proposes a framework for Reinforcement Learning (RL)-based task offloading in the IoWT. We formulate the task offloading process considering the tradeoff between energy consumption and task accomplishment time. Moreover, we model the task offloading problem as a Markov Decision Process (MDP) and utilize the Q-learning technique to enable the wearable device to make optimal task offloading decisions without prior knowledge. We evaluate the performance of the proposed framework through extensive simulations for various applications and system configurations conducted in the ns-3 network simulator. We also show how varying the main system parameters of the Q-learning algorithm affects the overall performance in terms of average task accomplishment time, average energy consumption, and percentage of tasks offloaded.
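To make the formulation above concrete, here is a minimal tabular Q-learning sketch on a toy offloading problem. It is not the authors' ns-3 setup: the two channel states, the (energy, time) cost table, the weighting factor, and all hyperparameters are hypothetical values chosen only to show how a weighted energy/time cost can serve as the reward of an MDP.

```python
import random

LOCAL, OFFLOAD = 0, 1   # actions: run on the wearable or offload to the edge
BAD, GOOD = 0, 1        # states: hypothetical wireless channel quality

# Hypothetical normalized (energy, time) costs per (state, action).
COSTS = {
    (BAD, LOCAL): (1.0, 1.0),
    (GOOD, LOCAL): (1.0, 1.0),
    (BAD, OFFLOAD): (0.8, 2.0),
    (GOOD, OFFLOAD): (0.2, 0.4),
}

def reward(state, action, weight=0.5):
    """Negative weighted sum of energy and task accomplishment time."""
    energy, time = COSTS[(state, action)]
    return -(weight * energy + (1.0 - weight) * time)

def train(steps=20000, alpha=0.05, gamma=0.8, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in (BAD, GOOD) for a in (LOCAL, OFFLOAD)}
    state = rng.choice((BAD, GOOD))
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice((LOCAL, OFFLOAD))
        else:
            action = max((LOCAL, OFFLOAD), key=lambda a: q[(state, a)])
        # Assumption: the channel evolves independently of the chosen action.
        next_state = rng.choice((BAD, GOOD))
        best_next = max(q[(next_state, a)] for a in (LOCAL, OFFLOAD))
        target = reward(state, action) + gamma * best_next
        q[(state, action)] += alpha * (target - q[(state, action)])
        state = next_state
    return q

q_table = train()
policy = {s: max((LOCAL, OFFLOAD), key=lambda a: q_table[(s, a)])
          for s in (BAD, GOOD)}
```

With these made-up costs, the learned greedy policy should offload when the channel is good and compute locally when it is bad; changing `weight` shifts the balance between saving energy and finishing tasks quickly.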

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2510.07487

Country:

Europe > Finland (0.15)
Europe > Italy (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Energy > Energy Storage (0.66)
Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Woiwode, Dominik, Marten, Jakob, Rosenhahn, Bodo

A Rotation-Invariant Embedded Platform for (Neural) Cellular Automata

This paper presents a rotation-invariant embedded platform for simulating (neural) cellular automata (NCA) in modular robotic systems. Inspired by previous work on physical NCA, we introduce key innovations that overcome limitations in prior hardware designs. Our platform features a symmetric, modular structure, enabling seamless connections between cells regardless of orientation. Additionally, each cell is battery-powered, allowing it to operate independently and retain its state even when disconnected from the collective. To demonstrate the platform's applicability, we present a novel rotation-invariant NCA model for isotropic shape classification. The proposed system provides a robust foundation for exploring the physical realization of NCA, with potential applications in distributed robotic systems and self-organizing structures.

artificial intelligence, cellular automata, machine learning, (18 more...)

2510.0744

Country: Europe > Germany (0.46)

Genre: Research Report (0.64)

Industry:

Electrical Industrial Apparatus (0.88)
Energy > Energy Storage (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (0.88)
Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Liu, Penghang, Fons, Elizabeth, Vyetrenko, Svitlana, Borrajo, Daniel, Potluru, Vamsi, Veloso, Manuela

TS-Agent: A Time Series Reasoning Agent with Iterative Statistical Insight Gathering

Large language models (LLMs) have shown strong abilities in reasoning and problem solving, but recent studies reveal that they still struggle with time series reasoning tasks, where outputs are often affected by hallucination or knowledge leakage. In this work we propose TS-Agent, a time series reasoning agent that leverages LLMs strictly for what they excel at, i.e., gathering evidence and synthesizing it into conclusions through step-by-step reasoning, while delegating the extraction of statistical and structural information to time series analytical tools. Instead of mapping time series into text tokens, images, or embeddings, our agent interacts with raw numeric sequences through atomic operators, records outputs in an explicit evidence log, and iteratively refines its reasoning under the guidance of a self-critic and a final quality gate. This design avoids multi-modal alignment training, preserves the native form of time series, ensures interpretability and verifiability, and mitigates knowledge leakage or hallucination. Empirically, we evaluate the agent on established benchmarks. Our experiments show that TS-Agent achieves performance comparable to state-of-the-art LLMs on understanding benchmarks, and delivers significant improvements on reasoning tasks, where existing models often rely on memorization and fail in zero-shot settings.

large language model, machine learning, natural language, (17 more...)

2510.07432

Genre: Research Report > New Finding (0.46)

Industry:

Energy (0.70)
Banking & Finance (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Dorise, Adrien, Bellizzi, Marjorie, Girard, Adrien, Francesconi, Benjamin, May, Stéphane

Explaining raw data complexity to improve satellite onboard processing

With increasing processing power, deploying AI models for remote sensing directly onboard satellites is becoming feasible. However, new constraints arise, mainly when using raw, unprocessed sensor data instead of preprocessed ground-based products. While current solutions primarily rely on preprocessed sensor images, few approaches directly leverage raw data. This study investigates the effects of utilising raw data on deep learning models for object detection and classification tasks. We introduce a simulation workflow to generate raw-like products from high-resolution L1 imagery, enabling systematic evaluation. Two object detection models (YOLOv11n and YOLOX-S) are trained on both raw and L1 datasets, and their performance is compared using standard detection metrics and explainability tools. Results indicate that while both models perform similarly at low to medium confidence thresholds, the model trained on raw data struggles with object boundary identification at high confidence levels. This suggests that adapting AI architectures with improved contouring methods can enhance object detection on raw images, improving onboard AI for remote sensing.

artificial intelligence, deep learning, machine learning, (18 more...)

2510.06858

Country: Europe > France (0.29)

Genre: Research Report > New Finding (0.47)

Industry:

Transportation > Marine (0.93)
Energy (0.89)
Transportation > Freight & Logistics Services > Shipping (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Tackett, Justin, Francis, Benjamin, Garcia, Luis, Grimsman, David, Warnick, Sean

Machine-Learning Driven Load Shedding to Mitigate Instability Attacks in Power Grids

Critical infrastructures are becoming increasingly complex as our society becomes increasingly dependent on them. This complexity opens the door to new possibilities for attacks and a need for new defense strategies. Our work focuses on instability attacks on the power grid, wherein an attacker causes cascading outages by introducing unstable dynamics into the system. When stress is placed on the power grid, a standard mitigation approach is load-shedding: the system operator chooses a set of loads to shut off until the situation is resolved. While this technique is standard, there is no systematic approach to choosing which loads will stop an instability attack. We show a proof of concept on the IEEE 14 Bus System using the Achilles Heel Technologies Power Grid Analyzer, and show through an implementation of modified Prony analysis (MPA) that MPA is a viable method for detecting instability attacks and triggering defense mechanisms. Throughout the past two hundred years, the power grid has become a core part of the infrastructure of the world. Every modern facility relies on electricity to sustain the way of life that has become prevalent in first world countries, powering everything from life-sustaining equipment to financial transaction infrastructure.

artificial intelligence, instability attack, machine learning, (15 more...)

2509.26532

Country:

Europe (0.68)
North America > United States > Utah (0.28)

Genre: Research Report (0.50)

Industry:

Energy > Power Industry (1.00)
Government > Military > Cyberwarfare (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Security & Privacy (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Lebedev, Andreas, Das, Abhinav, Pappert, Sven, Schlüter, Stephan

Analyzing Uncertainty Quantification in Statistical and Deep Learning Models for Probabilistic Electricity Price Forecasting

Precise probabilistic forecasts are fundamental for energy risk management, and there is a wide range of both statistical and machine learning models for this purpose. Inherent to these probabilistic models is some form of uncertainty quantification. However, most models do not capture the full extent of uncertainty, which arises not only from the data itself but also from model and distributional choices. In this study, we examine uncertainty quantification in state-of-the-art statistical and deep learning probabilistic forecasting models for electricity price forecasting in the German market. In particular, we consider deep distributional neural networks (DDNNs) and augment them with an ensemble approach, Monte Carlo (MC) dropout, and conformal prediction to account for model uncertainty. Additionally, we consider the LASSO-estimated autoregressive (LEAR) approach combined with quantile regression averaging (QRA), generalized autoregressive conditional heteroskedasticity (GARCH), and conformal prediction. Across a range of performance metrics, we find that the LEAR-based models perform well in terms of probabilistic forecasting, irrespective of the uncertainty quantification method. Furthermore, we find that DDNNs benefit from incorporating both data and model uncertainty, improving both point and probabilistic forecasting. Uncertainty itself appears to be best captured by the models using conformal prediction. Overall, our extensive study shows that all models under consideration perform competitively. However, their relative performance depends on the choice of metrics for point and probabilistic forecasting.
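For readers unfamiliar with conformal prediction, the following is a minimal sketch of the split conformal variant with absolute-residual scores. The abstract does not say which variant the authors use, so this is an assumption; the function names and the toy price values are hypothetical.

```python
import math

def conformal_halfwidth(cal_true, cal_pred, alpha=0.1):
    """Half-width of a split conformal interval with target coverage 1 - alpha.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest absolute residual on a
    held-out calibration set.
    """
    scores = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this coverage level.
        return math.inf
    return scores[k - 1]

def conformal_interval(point_forecast, halfwidth):
    """Symmetric prediction interval around a point forecast."""
    return (point_forecast - halfwidth, point_forecast + halfwidth)

# Toy calibration data (hypothetical prices): absolute residuals are 1..9.
cal_pred = [50.0] * 9
cal_true = [51.0, 48.0, 53.0, 46.0, 55.0, 44.0, 57.0, 42.0, 59.0]

halfwidth = conformal_halfwidth(cal_true, cal_pred, alpha=0.2)
interval = conformal_interval(60.0, halfwidth)
```

Here `halfwidth` is 8.0 and `interval` is (52.0, 68.0). The interval width is driven entirely by held-out residuals, which is why the approach can wrap any point forecaster, whether statistical or neural.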

artificial intelligence, deep learning, machine learning, (17 more...)

2509.19417

Country:

Europe > Germany (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Energy > Renewable (1.00)
Energy > Power Industry (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Robu, Valentin, Klein, Mark

Using utility graphs to search for Pareto-optimal outcomes in complex, interdependent issue negotiations

Negotiation is a powerful tool for modelling complex interactions between self-interested agents, which can be people, companies or, increasingly, AI-enabled autonomous agents, that aim to reach the best agreement for their human owners. While negotiation is often thought of as a competitive process, in which one party wins and the other one loses, in practice most real negotiations involve more complex, win-win scenarios (Raiffa [20]), in which agreements can be found that maximize the utilities of both agents. Such outcomes (agreements) are called Pareto-efficient, i.e. it is not possible to find another outcome that would increase one agent's utility, without making another agent worse off. Yet, finding agreements that are Pareto-efficient is a challenging computational problem, especially in complex negotiation domains, where issues negotiated upon are interdependent (i.e. the utility of the value chosen for one negotiation issue depends strongly on the choice for other ones). Consider, for example, the negotiations between parties in a logistic supply chain: producers want to have certain combinations of resources/quantities, delivered at certain times to be able to produce their goods, while suppliers may face similar constraints in their cost function for supplying different combinations of items. Or the peer-to-peer negotiations between prosumers in a decentralised power grid, that require certain amounts of energy at different times and locations, which involve non-linear constraints, especially if the capacity of the distribution network is limited.
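The definition of Pareto efficiency given above can be illustrated with a brute-force filter over a small set of candidate agreements. This is only a sketch of the definition, not the utility-graph search the paper proposes; the type alias, function names, and utility values are hypothetical.

```python
from typing import List, Tuple

Outcome = Tuple[float, float]  # (utility of agent A, utility of agent B)

def dominates(a: Outcome, b: Outcome) -> bool:
    """True if a is at least as good as b for every agent and strictly
    better for at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(outcomes: List[Outcome]) -> List[Outcome]:
    """Keep the outcomes that no other outcome dominates (O(n^2) scan)."""
    return [o for o in outcomes
            if not any(dominates(p, o) for p in outcomes)]

# Hypothetical utilities of five candidate agreements.
candidates = [(1.0, 5.0), (2.0, 4.0), (2.0, 2.0), (4.0, 1.0), (0.5, 0.5)]
front = pareto_front(candidates)
```

With these values, `front` contains (1.0, 5.0), (2.0, 4.0) and (4.0, 1.0). The quadratic scan is workable for a handful of outcomes; with interdependent issues the outcome space grows combinatorially, which is the computational difficulty the abstract describes.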

artificial intelligence, graph, machine learning, (20 more...)

2509.10885

Country:

North America > United States (0.46)
Europe > Netherlands (0.28)

Genre: Research Report (1.00)

Industry:

Transportation (0.68)
Information Technology > Services (0.46)
Energy > Power Industry (0.34)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.93)