Energy
Case study of a differentiable heterogeneous multiphysics solver for a nuclear fusion application
Coughlin, Jack B., Joglekar, Archis, Brodrick, Jonathan, Lavin, Alexander
This work presents a case study of a heterogeneous multiphysics solver from the nuclear fusion domain. At the macroscopic scale, an auto-differentiable ODE solver in JAX computes the evolution of the pulsed power circuit and bulk plasma parameters for a compressing Z Pinch. The ODE solver requires a closure for the impedance of the plasma load obtained via root-finding at every timestep, which we solve efficiently using gradient-based Newton iteration. However, incorporating non-differentiable production-grade plasma solvers like Gkeyll (a C/CUDA plasma simulation suite) into a gradient-based workflow is non-trivial. The ''Tesseract'' software addresses this challenge by providing a multi-physics differentiable abstraction layer made fully compatible with JAX (through the `tesseract_jax` adapter). This architecture ensures end-to-end differentiability while allowing seamless interchange between high-fidelity solvers (Gkeyll), neural surrogates, and analytical approximations for rapid, progressive prototyping.
Real-time distortion prediction in metallic additive manufacturing via a physics-informed neural operator approach
Tian, Mingxuan, Mu, Haochen, Ding, Donghong, Li, Mengjiao, Ding, Yuhan, Zhao, Jianping
With the development of digital twins and smart manufacturing systems, there is an urgent need for real-time distortion field prediction to control defects in metal Additive Manufacturing (AM). However, numerical simulation methods suffer from high computational cost, long run-times that prevent real-time use, while conventional Machine learning (ML) models struggle to extract spatiotemporal features for long-horizon prediction and fail to decouple thermo-mechanical fields. This paper proposes a Physics-informed Neural Operator (PINO) to predict z and y-direction distortion for the future 15 s. Our method, Physics-informed Deep Operator Network-Recurrent Neural Network (PIDeepONet-RNN) employs trunk and branch network to process temperature history and encode distortion fields, respectively, enabling decoupling of thermo-mechanical responses. By incorporating the heat conduction equation as a soft constraint, the model ensures physical consistency and suppresses unphysical artifacts, thereby establishing a more physically consistent mapping between the thermal history and distortion. This is important because such a basis function, grounded in physical laws, provides a robust and interpretable foundation for predictions. The proposed models are trained and tested using datasets generated from experimentally validated Finite Element Method (FEM). Evaluation shows that the model achieves high accuracy, low error accumulation, time efficiency. The max absolute errors in the z and y-directions are as low as 0.9733 mm and 0.2049 mm, respectively. The error distribution shows high errors in the molten pool but low gradient norms in the deposited and key areas. The performance of PINO surrogate model highlights its potential for real-time long-horizon physics field prediction in controlling defects.
Transformer-Based Scalable Multi-Agent Reinforcement Learning for Networked Systems with Long-Range Interactions
Sinha, Vidur, Ustaomeroglu, Muhammed, Qu, Guannan
Multi-agent reinforcement learning (MARL) has shown promise for large-scale network control, yet existing methods face two major limitations. First, they typically rely on assumptions leading to decay properties of local agent interactions, limiting their ability to capture long-range dependencies such as cascading power failures or epidemic outbreaks. Second, most approaches lack generalizability across network topologies, requiring retraining when applied to new graphs. We introduce STACCA (Shared Transformer Actor-Critic with Counterfactual Advantage), a unified transformer-based MARL framework that addresses both challenges. STACCA employs a centralized Graph Transformer Critic to model long-range dependencies and provide system-level feedback, while its shared Graph Transformer Actor learns a generalizable policy capable of adapting across diverse network structures. Further, to improve credit assignment during training, STACCA integrates a novel counterfactual advantage estimator that is compatible with state-value critic estimates. We evaluate STACCA on epidemic containment and rumor-spreading network control tasks, demonstrating improved performance, network generalization, and scalability. These results highlight the potential of transformer-based MARL architectures to achieve scalable and generalizable control in large-scale networked systems.
Method of Manufactured Learning for Solver-free Training of Neural Operators
Training neural operators to approximate mappings between infinite-dimensional function spaces often requires extensive datasets generated by either demanding experimental setups or computationally expensive numerical solvers. This dependence on solver-based data limits scalability and constrains exploration across physical systems. Here we introduce the Method of Manufactured Learning (MML), a solver-independent framework for training neural operators using analytically constructed, physics-consistent datasets. Inspired by the classical method of manufactured solutions, MML replaces numerical data generation with functional synthesis, i.e., smooth candidate solutions are sampled from controlled analytical spaces, and the corresponding forcing fields are derived by direct application of the governing differential operators. During inference, setting these forcing terms to zero restores the original governing equations, allowing the trained neural operator to emulate the true solution operator of the system. The framework is agnostic to network architecture and can be integrated with any operator learning paradigm. In this paper, we employ Fourier neural operator as a representative example. Across canonical benchmarks including heat, advection, Burgers, and diffusion-reaction equations. MML achieves high spectral accuracy, low residual errors, and strong generalization to unseen conditions. By reframing data generation as a process of analytical synthesis, MML offers a scalable, solver-agnostic pathway toward constructing physically grounded neural operators that retain fidelity to governing laws without reliance on expensive numerical simulations or costly experimental data for training.
An Evaluation of Representation Learning Methods in Particle Physics Foundation Models
Chen, Michael, Kansal, Raghav, Gandrakota, Abhijith, Hao, Zichun, Ngadiuba, Jennifer, Spiropulu, Maria
We present a systematic evaluation of representation learning objectives for particle physics within a unified framework. Our study employs a shared transformer-based particle-cloud encoder with standardized preprocessing, matched sampling, and a consistent evaluation protocol on a jet classification dataset. We compare contrastive (supervised and self-supervised), masked particle modeling, and generative reconstruction objectives under a common training regimen. In addition, we introduce targeted supervised architectural modifications that achieve state-of-the-art performance on benchmark evaluations. This controlled comparison isolates the contributions of the learning objective, highlights their respective strengths and limitations, and provides reproducible baselines. We position this work as a reference point for the future development of foundation models in particle physics, enabling more transparent and robust progress across the community.
Multi-Agent Reinforcement Learning for Heterogeneous Satellite Cluster Resources Optimization
Hady, Mohamad A., Hu, Siyi, Pratama, Mahardhika, Cao, Zehong, Kowalczyk, Ryszard
This work investigates resource optimization in heterogeneous satellite clusters performing autonomous Earth Observation (EO) missions using Reinforcement Learning (RL). In the proposed setting, two optical satellites and one Synthetic Aperture Radar (SAR) satellite operate cooperatively in low Earth orbit to capture ground targets and manage their limited onboard resources efficiently. Traditional optimization methods struggle to handle the real-time, uncertain, and decentralized nature of EO operations, motivating the use of RL and Multi-Agent Reinforcement Learning (MARL) for adaptive decision-making. This study systematically formulates the optimization problem from single-satellite to multi-satellite scenarios, addressing key challenges including energy and memory constraints, partial observability, and agent heterogeneity arising from diverse payload capabilities. Using a near-realistic simulation environment built on the Basilisk and BSK-RL frameworks, we evaluate the performance and stability of state-of-the-art MARL algorithms such as MAPPO, HAPPO, and HATRPO. Results show that MARL enables effective coordination across heterogeneous satellites, balancing imaging performance and resource utilization while mitigating non-stationarity and inter-agent reward coupling. The findings provide practical insights into scalable, autonomous satellite operations and contribute a foundation for future research on intelligent EO mission planning under heterogeneous and dynamic conditions.
Which Way from B to A: The role of embedding geometry in image interpolation for Stable Diffusion
Karris, Nicholas, Durell, Luke, Flores, Javier, Emerson, Tegan
It can be shown that Stable Diffusion has a permutation-invariance property with respect to the rows of Contrastive Language-Image Pretraining (CLIP) embedding matrices. This inspired the novel observation that these embeddings can naturally be interpreted as point clouds in a Wasserstein space rather than as matrices in a Euclidean space. This perspective opens up new possibilities for understanding the geometry of embedding space. For example, when interpolating between embeddings of two distinct prompts, we propose reframing the interpolation problem as an optimal transport problem. By solving this optimal transport problem, we compute a shortest path (or geodesic) between embeddings that captures a more natural and geometrically smooth transition through the embedding space. This results in smoother and more coherent intermediate (interpolated) images when rendered by the Stable Diffusion generative model. We conduct experiments to investigate this effect, comparing the quality of interpolated images produced using optimal transport to those generated by other standard interpolation methods. The novel optimal transport--based approach presented indeed gives smoother image interpolations, suggesting that viewing the embeddings as point clouds (rather than as matrices) better reflects and leverages the geometry of the embedding space.
DIVIDE: A Framework for Learning from Independent Multi-Mechanism Data Using Deep Encoders and Gaussian Processes
Chawla, Vivek, Slautin, Boris, Pratiush, Utkarsh, Penumadu, Dayakar, Kalinin, Sergei
ABSTRACT Scientific datasets often arise from multiple independent mechanisms such as spati al, categorical or structural effects, whose combined influence obscures their individual contributions. We introduce DIVIDE, a framework that disentangles these influences by integrating mechanism - specific deep encoders with a structured Gaussian Process in a joint latent space. Disentanglement here refers to separating independently acting generative factors . The encoders isolate distinct mechanisms while the Gaussian Process captures their combined effect with calibrated uncertainty. The architecture supports structured priors, enabling interpretable and mechanism - aware prediction as well as efficient active l earning. Across benc hmarks, DIVIDE separates mechanisms, reproduces additive and scaled interactions, and remains robust under noise. The framework extends naturally to multifunctional datasets where mechanical, electromagnetic or optical responses coexist. INTRODUCTION Many real - world systems exhibit behavior driven by the combined influence of multiple independent mechanisms. These mechanisms may represent categorical factors, spatial dependencies, or nonlinear physical responses. While the scalar output of such systems is observable, the individual contributions of these mechanisms are often unknown and unmeasured. Modeling this type of data requires not only accurate predictions but also the ability to attribute variation in the output to specific, distinct sources. In thi s context, we use disentanglement to mean recovering those independently acting generative factors from observational data. Disentangling these contributions is particularly important in scientific and engineering domains where interpretability, causality, and mechanism - aware reasoning are essential. Partial solutions to this challenge have emerged from the field of disentangled representation learning, which seeks to identify independent factors of variation from high - dimensional data.
An Evaluation Framework for Network IDS/IPS Datasets: Leveraging MITRE ATT&CK and Industry Relevance Metrics
Tori, Adrita Rahman, Hasan, Khondokar Fida
The performance of Machine Learning (ML) and Deep Learning (DL)-based Intrusion Detection and Prevention Systems (IDS/IPS) is critically dependent on the relevance and quality of the datasets used for training and evaluation. However, current AI model evaluation practices for developing IDS/IPS focus predominantly on accuracy metrics, often overlooking whether datasets represent industry-specific threats. To address this gap, we introduce a novel multi-dimensional framework that integrates the MITRE ATT&CK knowledge base for threat intelligence and employs five complementary metrics that together provide a comprehensive assessment of dataset suitability. Methodologically, this framework combines threat intelligence, natural language processing, and quantitative analysis to assess the suitability of datasets for specific industry contexts. Applying this framework to nine publicly available IDS/IPS datasets reveals significant gaps in threat coverage, particularly in the healthcare, energy, and financial sectors. In particular, recent datasets (e.g., CIC-IoMT, CIC-UNSW-NB15) align better with sector-specific threats, whereas others, like CICIoV-24, underperform despite their recency. Our findings provide a standardized, interpretable approach for selecting datasets aligned with sector-specific operational requirements, ultimately enhancing the real-world effectiveness of AI-driven IDS/IPS deployments. The efficiency and practicality of the framework are validated through deployment in a real-world case study, underscoring its capacity to inform dataset selection and enhance the effectiveness of AI-driven IDS/IPS in operational environments.
EcoFlight: Finding Low-Energy Paths Through Obstacles for Autonomous Sensing Drones
Leyva, Jordan, Vera, Nahim J. Moran, Xu, Yihan, Durasno, Adrien, Romero, Christopher U., Chimuka, Tendai, Ramirez, Gabriel O. Huezo, Dong, Ziqian, Rojas-Cessa, Roberto
Obstacle avoidance path planning for uncrewed aerial vehicles (UAVs), or drones, is rarely addressed in most flight path planning schemes, despite obstacles being a realistic condition. Obstacle avoidance can also be energy-intensive, making it a critical factor in efficient point-to-point drone flights. To address these gaps, we propose EcoFlight, an energy-efficient pathfinding algorithm that determines the lowest-energy route in 3D space with obstacles. The algorithm models energy consumption based on the drone propulsion system and flight dynamics. We conduct extensive evaluations, comparing EcoFlight with direct-flight and shortest-distance schemes. The simulation results across various obstacle densities show that EcoFlight consistently finds paths with lower energy consumption than comparable algorithms, particularly in high-density environments. We also demonstrate that a suitable flying speed can further enhance energy savings.