Plasma


Case study of a differentiable heterogeneous multiphysics solver for a nuclear fusion application

Coughlin, Jack B., Joglekar, Archis, Brodrick, Jonathan, Lavin, Alexander

arXiv.org Artificial Intelligence

This work presents a case study of a heterogeneous multiphysics solver from the nuclear fusion domain. At the macroscopic scale, an auto-differentiable ODE solver in JAX computes the evolution of the pulsed power circuit and bulk plasma parameters for a compressing Z Pinch. The ODE solver requires a closure for the impedance of the plasma load obtained via root-finding at every timestep, which we solve efficiently using gradient-based Newton iteration. However, incorporating non-differentiable production-grade plasma solvers like Gkeyll (a C/CUDA plasma simulation suite) into a gradient-based workflow is non-trivial. The "Tesseract" software addresses this challenge by providing a multi-physics differentiable abstraction layer made fully compatible with JAX (through the `tesseract_jax` adapter). This architecture ensures end-to-end differentiability while allowing seamless interchange between high-fidelity solvers (Gkeyll), neural surrogates, and analytical approximations for rapid, progressive prototyping.
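The per-timestep closure described above, finding the root of an impedance residual by Newton iteration, can be sketched as follows. The residual function here is purely illustrative (the abstract does not give the actual circuit-plasma matching condition), and a real JAX implementation would obtain the derivative by autodiff rather than the finite difference used in this self-contained demo.

```python
import numpy as np

def newton_root(f, x0, tol=1e-10, max_iter=50, h=1e-7):
    """Scalar Newton iteration with a finite-difference derivative.

    In the JAX workflow described above the derivative would come from
    automatic differentiation; a forward difference keeps this sketch
    dependency-free.
    """
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        dfx = (f(x + h) - fx) / h  # finite-difference slope
        x = x - fx / dfx           # Newton update
    return x

# Hypothetical residual for a plasma-load impedance Z: the closure is
# satisfied when the circuit and plasma sides agree (illustrative only).
residual = lambda Z: Z**3 + 2.0 * Z - 5.0
Z_star = newton_root(residual, x0=1.0)
```

Because the iteration is just repeated function evaluation and division, it composes cleanly with an outer differentiable ODE step; JAX-style implementations typically differentiate through the converged root via the implicit function theorem rather than unrolling the loop.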



Why the AI Industry Is Betting on a Fusion Energy Breakthrough

TIME - Tech

Booth is a reporter at TIME. When Sam Altman arrived at Helion Energy's small Redmond, Wash., office in early 2014, nuclear-fusion textbooks tucked under his arm, the company was focusing its efforts on research and development. By the time he left, several days later, he had persuaded the fusion-energy startup to chart a more aggressive path toward deployment, CEO David Kirtley recalls. A year later, Altman, who was co-founding OpenAI around the same time, invested $9.5 million in Helion, taking the role of chairman.


Fast and Interpretable Protein Substructure Alignment via Optimal Transport

Wang, Zhiyu, Zhou, Bingxin, Wang, Jing, Tan, Yang, Zhao, Weishu, Liò, Pietro, Hong, Liang

arXiv.org Artificial Intelligence

Proteins are essential biological macromolecules that execute life functions. Local motifs within protein structures, such as active sites, are the most critical components for linking structure to function and are key to understanding protein evolution and enabling protein engineering. Existing computational methods struggle to identify and compare these local structures, which leaves a significant gap in understanding protein structures and harnessing their functions. This study presents PLASMA, the first deep learning framework for efficient and interpretable residue-level protein substructure alignment. We reformulate the problem as a regularized optimal transport task and leverage differentiable Sinkhorn iterations. Through extensive quantitative evaluations and three biological case studies, we demonstrate that PLASMA achieves accurate, lightweight, and interpretable residue-level alignment. Additionally, we introduce PLASMA-PF, a training-free variant that provides a practical alternative when training data are unavailable. Our method addresses a critical gap in protein structure analysis tools and offers new opportunities for functional annotation, evolutionary studies, and structure-based drug design. Proteins are essential macromolecules responsible for life functions, from catalysis and signal transduction to structural support and transport. Functional motifs (e.g., catalytic residues, binding pockets, metal-binding sites) are critical for understanding mechanisms, designing therapeutics, and guiding protein engineering (Mills et al., 2018).
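The core computational step named above, solving a regularized optimal transport problem with differentiable Sinkhorn iterations, can be sketched minimally as below. This is generic entropic OT between uniform marginals; PLASMA's actual cost matrix, regularizers, and learned features are not reproduced here.

```python
import numpy as np

def sinkhorn(C, eps=0.1, n_iter=200):
    """Entropy-regularized optimal transport between uniform marginals.

    C : (n, m) cost matrix (e.g. pairwise residue dissimilarities).
    Returns the transport plan P; its rows and columns approximately
    sum to the source and target marginals.
    """
    n, m = C.shape
    a = np.full(n, 1.0 / n)  # source marginal
    b = np.full(m, 1.0 / m)  # target marginal
    K = np.exp(-C / eps)     # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iter):  # alternating marginal projections
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
C = rng.random((5, 7))       # toy cost between 5 and 7 residues
P = sinkhorn(C)
```

Each iteration consists only of matrix-vector products and elementwise division, which is why the scheme is differentiable end-to-end: gradients can flow through the unrolled loop, making the alignment trainable inside a deep network.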


GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations

Paischer, Fabian, Galletti, Gianluca, Hornsby, William, Setinek, Paul, Zanisi, Lorenzo, Carey, Naomi, Pamela, Stanislas, Brandstetter, Johannes

arXiv.org Machine Learning

Nuclear fusion plays a pivotal role in the quest for reliable and sustainable energy production. A major roadblock to viable fusion power is understanding plasma turbulence, which significantly impairs plasma confinement, and is vital for next-generation reactor design. Plasma turbulence is governed by the nonlinear gyrokinetic equation, which evolves a 5D distribution function over time. Due to its high computational cost, reduced-order models are often employed in practice to approximate turbulent transport of energy. However, they omit nonlinear effects unique to the full 5D dynamics. To tackle this, we introduce GyroSwin, the first scalable 5D neural surrogate that can model 5D nonlinear gyrokinetic simulations, thereby capturing the physical phenomena neglected by reduced models, while providing accurate estimates of turbulent heat transport. GyroSwin (i) extends hierarchical Vision Transformers to 5D, (ii) introduces cross-attention and integration modules for latent 3D↔5D interactions between electrostatic potential fields and the distribution function, and (iii) performs channelwise mode separation inspired by nonlinear physics. We demonstrate that GyroSwin outperforms widely used reduced numerics on heat flux prediction, captures the turbulent energy cascade, and reduces the cost of fully resolved nonlinear gyrokinetics by three orders of magnitude while remaining physically verifiable. GyroSwin shows promising scaling laws, tested up to one billion parameters, paving the way for scalable neural surrogates for gyrokinetic simulations of plasma turbulence.
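The latent 3D↔5D interaction mechanism mentioned above can be illustrated at a shape level with single-head cross-attention, where tokens from one modality (the 3D potential field) attend to tokens from the other (the flattened 5D distribution). All dimensions and weights below are invented for the sketch and do not reflect GyroSwin's actual architecture.

```python
import numpy as np

def cross_attention(q_tokens, kv_tokens, d_k=16, seed=0):
    """Single-head cross-attention: queries from one modality attend to
    keys/values projected from the other (shape-level sketch only)."""
    rng = np.random.default_rng(seed)
    d_q, d_kv = q_tokens.shape[1], kv_tokens.shape[1]
    Wq = rng.normal(size=(d_q, d_k)) / np.sqrt(d_q)
    Wk = rng.normal(size=(d_kv, d_k)) / np.sqrt(d_kv)
    Wv = rng.normal(size=(d_kv, d_k)) / np.sqrt(d_kv)
    Q, K, V = q_tokens @ Wq, kv_tokens @ Wk, kv_tokens @ Wv
    scores = Q @ K.T / np.sqrt(d_k)              # scaled dot products
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)            # row-wise softmax
    return w @ V

# Hypothetical token sets: 3D potential-field patches vs. flattened
# 5D distribution-function patches (sizes chosen arbitrarily).
phi_tokens = np.random.default_rng(1).normal(size=(8, 32))
f_tokens = np.random.default_rng(2).normal(size=(64, 48))
out = cross_attention(phi_tokens, f_tokens)      # shape (8, d_k)
```

The point of the sketch is the asymmetry: the query side and key/value side can have different token counts and feature widths, which is what lets a low-dimensional field exchange information with a much larger 5D representation.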




TGLF-SINN: Deep Learning Surrogate Model for Accelerating Turbulent Transport Modeling in Fusion

Cao, Yadi, Zhang, Futian, Liu, Wesley, Neiser, Tom, Meneghini, Orso, Fuller, Lawson, Smith, Sterling, Nazikian, Raffi, Sammuli, Brian, Yu, Rose

arXiv.org Artificial Intelligence

The Trapped Gyro-Landau Fluid (TGLF) model provides fast, accurate predictions of turbulent transport in tokamaks, but whole device simulations requiring thousands of evaluations remain computationally expensive. Neural network (NN) surrogates offer accelerated inference with fully differentiable approximations that enable gradient-based coupling but typically require large training datasets to capture transport flux variations across plasma conditions, creating significant training burden and limiting applicability to expensive gyrokinetic simulations. We propose TGLF-SINN (Spectra-Informed Neural Network) with three key innovations: (1) principled feature engineering that reduces target prediction range, simplifying the learning task; (2) physics-guided regularization of transport spectra to improve generalization under sparse data; and (3) Bayesian Active Learning (BAL) to strategically select training samples based on model uncertainty, reducing data requirements while maintaining accuracy. Our approach achieves superior performance with significantly less training data. In offline settings, TGLF-SINN reduces logarithmic root mean squared error (LRMSE) by 12.4% compared to the current baseline \base. Using only 25% of the complete dataset with BAL, we achieve LRMSE only 0.0165 higher than \base and 0.0248 higher than our offline model (0.0583). In downstream flux matching applications, our NN surrogate provides 45x speedup over TGLF while maintaining comparable accuracy, demonstrating potential for training efficient surrogates for higher-fidelity models where data acquisition is costly and sparse.
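The LRMSE metric reported above can be sketched as an RMSE computed on log-transformed targets. The abstract does not specify the exact transform, so the sign-preserving log1p used here is an assumption; it is one common choice for transport fluxes that span orders of magnitude and can change sign.

```python
import numpy as np

def lrmse(y_true, y_pred):
    """Logarithmic RMSE: RMSE on sign-preserving log1p-transformed values.

    t(y) = sign(y) * log(1 + |y|) compresses the dynamic range of the
    targets, which is also the spirit of the paper's feature engineering
    (reducing the target prediction range before learning).
    """
    t = lambda y: np.sign(y) * np.log1p(np.abs(y))
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((t(y_true) - t(y_pred)) ** 2)))

# Toy fluxes spanning several orders of magnitude, with a sign change.
fluxes_true = [1e-3, 0.5, 12.0, -300.0]
fluxes_pred = [2e-3, 0.4, 15.0, -250.0]
err = lrmse(fluxes_true, fluxes_pred)
```

Under such a transform, a fixed relative error contributes roughly equally to the loss whether the flux is tiny or huge, which is why log-scale metrics are preferred for transport quantities.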


A Implementation Details

Neural Information Processing Systems

In this section, we derive the computational complexity of the TIP algorithm.

A.4 Cost Function Details

We set n = 15 and m = 1 for our Monte Carlo estimate of the cost function for each problem. As mentioned in the main text, we use the iCEM method from Pinneri et al. In Tables 4 and 5, we present the hyperparameters used for the planning algorithm across each problem; Table 4 lists the hyperparameters used for optimization in the MPC procedure for closed-loop control problems.
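The Monte Carlo cost estimate mentioned above (n = 15, m = 1) amounts to averaging the cost of repeated stochastic rollouts of a candidate plan. The dynamics and stage cost below are placeholders invented for the sketch; only the averaging structure is taken from the text.

```python
import numpy as np

def mc_cost(rollout_cost, n=15, m=1, seed=0):
    """Monte Carlo estimate of a plan's cost: average n * m stochastic
    rollout evaluations (n samples, m repeats each, as in the text)."""
    rng = np.random.default_rng(seed)
    samples = [rollout_cost(rng) for _ in range(n * m)]
    return float(np.mean(samples))

def rollout_cost(rng, horizon=10):
    """Placeholder rollout: scalar linear dynamics with process noise,
    accumulating a quadratic stage cost."""
    x, total = 1.0, 0.0
    for _ in range(horizon):
        x = 0.9 * x + 0.1 * rng.normal()  # toy stochastic dynamics
        total += x ** 2                   # stage cost
    return total

estimate = mc_cost(rollout_cost)
```

In an iCEM-style planner this estimate would be computed per sampled action sequence, and the elite (lowest-cost) sequences would refit the sampling distribution for the next iteration.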


DiaLLMs: EHR Enhanced Clinical Conversational System for Clinical Test Recommendation and Diagnosis Prediction

Ren, Weijieying, Zhao, Tianxiang, Wang, Lei, Wang, Tianchun, Honavar, Vasant

arXiv.org Artificial Intelligence

Recent advances in Large Language Models (LLMs) have led to remarkable progress in medical consultation. However, existing medical LLMs overlook the essential role of Electronic Health Records (EHR) and focus primarily on diagnosis recommendation, limiting their clinical applicability. We propose DiaLLM, the first medical LLM that integrates heterogeneous EHR data into clinically grounded dialogues, enabling clinical test recommendation, result interpretation, and diagnosis prediction to better align with real-world medical practice. To construct clinically grounded dialogues from EHR, we design a Clinical Test Reference (CTR) strategy that maps each clinical code to its corresponding description and classifies test results as "normal" or "abnormal". Additionally, DiaLLM employs a reinforcement learning framework for evidence acquisition and automated diagnosis. To handle the large action space, we introduce a reject sampling strategy to reduce redundancy and improve exploration efficiency. Furthermore, a confirmation reward and a class-sensitive diagnosis reward are designed to guide accurate diagnosis prediction. Extensive experimental results demonstrate that DiaLLM outperforms baselines in clinical test recommendation and diagnosis prediction.
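The Clinical Test Reference (CTR) idea described above, mapping each clinical code to a description and labeling results "normal" or "abnormal", can be sketched as a lookup against reference ranges. The codes, descriptions, and ranges below are invented for illustration and are not from the paper.

```python
# Hypothetical CTR table: code -> (description, (low, high) reference range).
CTR = {
    "GLU": ("blood glucose (mg/dL)", (70.0, 140.0)),
    "HGB": ("hemoglobin (g/dL)", (12.0, 17.5)),
}

def interpret(code, value):
    """Map a clinical code to its description and classify the result
    as "normal" or "abnormal" against the reference range."""
    name, (lo, hi) = CTR[code]
    status = "normal" if lo <= value <= hi else "abnormal"
    return f"{name}: {value} ({status})"

print(interpret("GLU", 182.0))  # out-of-range reading
print(interpret("HGB", 13.0))   # in-range reading
```

Grounding dialogue turns in such structured interpretations, rather than raw codes, is what lets the generated conversations stay clinically meaningful.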