AITopics | Balaprakash, Prasanna

Collaborating Authors

Balaprakash, Prasanna

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

How does ion temperature gradient turbulence depend on magnetic geometry? Insights from data and machine learning

Landreman, Matt, Choi, Jong Youl, Alves, Caio, Balaprakash, Prasanna, Churchill, R. Michael, Conlin, Rory, Roberg-Clark, Gareth

arXiv.org Artificial IntelligenceFeb-17-2025

Magnetic geometry has a significant effect on the level of turbulent transport in fusion plasmas. Here, we model and analyze this dependence using multiple machine learning methods and a dataset of > 200,000 nonlinear simulations of ion-temperature-gradient turbulence in diverse non-axisymmetric geometries. The dataset is generated using a large collection of both optimized and randomly generated stellarator equilibria. At fixed gradients, the turbulent heat flux varies between geometries by several orders of magnitude. Trends are apparent among the configurations with particularly high or low heat flux. Regression and classification techniques from machine learning are then applied to extract patterns in the dataset. Due to a symmetry of the gyrokinetic equation, the heat flux and regressions thereof should be invariant to translations of the raw features in the parallel coordinate, similar to translation invariance in computer vision applications. Multiple regression models including convolutional neural networks (CNNs) and decision trees can achieve reasonable predictive power for the heat flux in held-out test configurations, with highest accuracy for the CNNs. Using Spearman correlation, sequential feature selection, and Shapley values to measure feature importance, it is consistently found that the most important geometric lever on the heat flux is the flux surface compression in regions of bad curvature. The second most important feature relates to the magnitude of geodesic curvature. These two features align remarkably with surrogates that have been proposed based on theory, while the methods here allow a natural extension to more features for increased accuracy. The dataset, released with this publication, may also be used to test other proposed surrogates, and we find many previously published proxies do correlate well with both the heat flux and stability boundary.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2502.11657

Country: North America > United States > Maryland (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Bayesian optimized deep ensemble for uncertainty quantification of deep neural networks: a system safety case study on sodium fast reactor thermal stratification modeling

Abulawi, Zaid, Hu, Rui, Balaprakash, Prasanna, Liu, Yang

arXiv.org Machine LearningDec-11-2024

Accurate predictions and uncertainty quantification (UQ) are essential for decision-making in risk-sensitive fields such as system safety modeling. Deep ensembles (DEs) are efficient and scalable methods for UQ in Deep Neural Networks (DNNs); however, their performance is limited when constructed by simply retraining the same DNN multiple times with randomly sampled initializations. To overcome this limitation, we propose a novel method that combines Bayesian optimization (BO) with DE, referred to as BODE, to enhance both predictive accuracy and UQ. We apply BODE to a case study involving a Densely connected Convolutional Neural Network (DCNN) trained on computational fluid dynamics (CFD) data to predict eddy viscosity in sodium fast reactor thermal stratification modeling. Compared to a manually tuned baseline ensemble, BODE estimates total uncertainty approximately four times lower in a noise-free environment, primarily due to the baseline's overestimation of aleatoric uncertainty. Specifically, BODE estimates aleatoric uncertainty close to zero, while aleatoric uncertainty dominates the total uncertainty in the baseline ensemble. We also observe a reduction of more than 30% in epistemic uncertainty. When Gaussian noise with standard deviations of 5% and 10% is introduced into the data, BODE accurately fits the data and estimates uncertainty that aligns with the data noise. These results demonstrate that BODE effectively reduces uncertainty and enhances predictions in data-driven models, making it a flexible approach for various applications requiring accurate predictions and robust UQ.

artificial intelligence, machine learning, noise, (17 more...)

arXiv.org Machine Learning

2412.08776

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.66)

Industry:

Energy > Power Industry > Utilities > Nuclear (1.00)
Health & Medicine (0.93)
Energy > Oil & Gas > Upstream (0.93)
Government > Regional Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Generalizable Prediction Model of Molten Salt Mixture Density with Chemistry-Informed Transfer Learning

Barra, Julian, Shahbazi, Shayan, Birri, Anthony, Chahal, Rajni, Isah, Ibrahim, Anwar, Muhammad Nouman, Starkus, Tyler, Balaprakash, Prasanna, Lam, Stephen

arXiv.org Artificial IntelligenceOct-19-2024

Optimally designing molten salt applications requires knowledge of their thermophysical properties, but existing databases are incomplete, and experiments are challenging. Ideal mixing and Redlich-Kister models are computationally cheap but lack either accuracy or generality. To address this, a transfer learning approach using deep neural networks (DNNs) is proposed, combining Redlich-Kister models, experimental data, and ab initio properties. The approach predicts molten salt density with high accuracy ($r^{2}$ > 0.99, MAPE < 1%), outperforming the alternatives.

artificial intelligence, machine learning, prediction, (19 more...)

arXiv.org Artificial Intelligence

2410.1512

Country: North America > United States > Massachusetts > Middlesex County > Lowell (0.29)

Genre: Research Report (0.64)

Industry:

Energy > Energy Storage (1.00)
Energy > Power Industry > Utilities > Nuclear (0.93)
Energy > Renewable (0.69)
Electrical Industrial Apparatus (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Scalable Training of Graph Foundation Models for Atomistic Materials Modeling: A Case Study with HydraGNN

Pasini, Massimiliano Lupo, Choi, Jong Youl, Mehta, Kshitij, Zhang, Pei, Rogers, David, Bae, Jonghyun, Ibrahim, Khaled Z., Aji, Ashwin M., Schulz, Karl W., Polo, Jorda, Balaprakash, Prasanna

arXiv.org Artificial IntelligenceJun-28-2024

We present our work on developing and training scalable graph foundation models (GFM) using HydraGNN, a multi-headed graph convolutional neural network architecture. HydraGNN expands the boundaries of graph neural network (GNN) in both training scale and data diversity. It abstracts over message passing algorithms, allowing both reproduction of and comparison across algorithmic innovations that define convolution in GNNs. This work discusses a series of optimizations that have allowed scaling up the GFM training to tens of thousands of GPUs on datasets that consist of hundreds of millions of graphs. Our GFMs use multi-task learning (MTL) to simultaneously learn graph-level and node-level properties of atomistic structures, such as the total energy and atomic forces. Using over 150 million atomistic structures for training, we illustrate the performance of our approach along with the lessons learned on two United States Department of Energy (US-DOE) supercomputers, namely the Perlmutter petascale system at the National Energy Research Scientific Computing Center and the Frontier exascale system at Oak Ridge National Laboratory. The HydraGNN architecture enables the GFM to achieve near-linear strong scaling performance using more than 2,000 GPUs on Perlmutter and 16,000 GPUs on Frontier. Hyperparameter optimization (HPO) was performed on over 64,000 GPUs on Frontier to select GFM architectures with high accuracy. Early stopping was applied on each GFM architecture for energy awareness in performing such an extreme-scale task. The training of an ensemble of highest-ranked GFM architectures continued until convergence to establish uncertainty quantification (UQ) capabilities with ensemble learning. Our contribution opens the door for rapidly developing, training, and deploying GFMs using large-scale computational resources to enable AI-accelerated materials discovery and design.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2406.12909

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Physics-Informed Heterogeneous Graph Neural Networks for DC Blocker Placement

Jin, Hongwei, Balaprakash, Prasanna, Zou, Allen, Ghysels, Pieter, Krishnapriyan, Aditi S., Mate, Adam, Barnes, Arthur, Bent, Russell

arXiv.org Artificial IntelligenceMay-16-2024

The threat of geomagnetic disturbances (GMDs) to the reliable operation of the bulk energy system has spurred the development of effective strategies for mitigating their impacts. One such approach involves placing transformer neutral blocking devices, which interrupt the path of geomagnetically induced currents (GICs) to limit their impact. The high cost of these devices and the sparsity of transformers that experience high GICs during GMD events, however, calls for a sparse placement strategy that involves high computational cost. To address this challenge, we developed a physics-informed heterogeneous graph neural network (PIHGNN) for solving the graph-based dc-blocker placement problem. Our approach combines a heterogeneous graph neural network (HGNN) with a physics-informed neural network (PINN) to capture the diverse types of nodes and edges in ac/dc networks and incorporates the physical laws of the power grid. We train the PIHGNN model using a surrogate power flow model and validate it using case studies. Results demonstrate that PIHGNN can effectively and efficiently support the deployment of GIC dc-current blockers, ensuring the continued supply of electricity to meet societal demands. Our approach has the potential to contribute to the development of more reliable and resilient power grids capable of withstanding the growing threat that GMDs pose.

artificial intelligence, machine learning, neural network, (16 more...)

arXiv.org Artificial Intelligence

2405.10389

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (1.00)

Industry:

Energy > Power Industry (1.00)
Energy > Oil & Gas > Upstream (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability

Wang, Xiao, Tsaris, Aristeidis, Liu, Siyan, Choi, Jong-Youl, Fan, Ming, Zhang, Wei, Yin, Junqi, Ashfaq, Moetasim, Lu, Dan, Balaprakash, Prasanna

arXiv.org Artificial IntelligenceApr-22-2024

Earth system predictability is challenged by the complexity of environmental dynamics and the multitude of variables involved. Current AI foundation models, although advanced by leveraging large and heterogeneous data, are often constrained by their size and data integration, limiting their effectiveness in addressing the full range of Earth system prediction challenges. To overcome these limitations, we introduce the Oak Ridge Base Foundation Model for Earth System Predictability (ORBIT), an advanced vision-transformer model that scales up to 113 billion parameters using a novel hybrid tensor-data orthogonal parallelism technique. As the largest model of its kind, ORBIT surpasses the current climate AI foundation model size by a thousandfold. Performance scaling tests conducted on the Frontier supercomputer have demonstrated that ORBIT achieves 230 to 707 PFLOPS, with scaling efficiency maintained at 78% to 96% across 24,576 AMD GPUs. These breakthroughs establish new advances in AI-driven climate modeling and demonstrate promise to significantly improve the Earth system predictability.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2404.14712

Country: North America > United States (0.94)

Genre: Research Report (1.00)

Industry: Government > Regional Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Sequence Length Scaling in Vision Transformers for Scientific Images on Frontier

Tsaris, Aristeidis, Zhang, Chengming, Wang, Xiao, Yin, Junqi, Liu, Siyan, Ashfaq, Moetasim, Fan, Ming, Choi, Jong Youl, Wahib, Mohamed, Lu, Dan, Balaprakash, Prasanna, Wang, Feiyi

arXiv.org Artificial IntelligenceApr-17-2024

Vision Transformers (ViTs) are pivotal for foundational models in scientific imagery, including Earth science applications, due to their capability to process large sequence lengths. While transformers for text has inspired scaling sequence lengths in ViTs, yet adapting these for ViTs introduces unique challenges. We develop distributed sequence parallelism for ViTs, enabling them to handle up to 1M tokens. Our approach, leveraging DeepSpeed-Ulysses and Long-Sequence-Segmentation with model sharding, is the first to apply sequence parallelism in ViT training, achieving a 94% batch scaling efficiency on 2,048 AMD-MI250X GPUs. Evaluating sequence parallelism in ViTs, particularly in models up to 10B parameters, highlighted substantial bottlenecks. We countered these with hybrid sequence, pipeline, tensor parallelism, and flash attention strategies, to scale beyond single GPU memory limits. Our method significantly enhances climate modeling accuracy by 20% in temperature predictions, marking the first training of a transformer model on a full-attention matrix over 188K sequence length.

artificial intelligence, machine learning, sequence length, (17 more...)

arXiv.org Artificial Intelligence

2405.1578

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Industry:

Health & Medicine (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

Network architecture search of X-ray based scientific applications

Balaji, Adarsha, Hadidi, Ramyad, Kollmer, Gregory, Fouda, Mohammed E., Balaprakash, Prasanna

arXiv.org Artificial IntelligenceApr-16-2024

X-ray and electron diffraction-based microscopy use bragg peak detection and ptychography to perform 3-D imaging at an atomic resolution. Typically, these techniques are implemented using computationally complex tasks such as a Psuedo-Voigt function or solving a complex inverse problem. Recently, the use of deep neural networks has improved the existing state-of-the-art approaches. However, the design and development of the neural network models depends on time and labor intensive tuning of the model by application experts. To that end, we propose a hyperparameter (HPS) and neural architecture search (NAS) approach to automate the design and optimization of the neural network models for model size, energy consumption and throughput. We demonstrate the improved performance of the auto-tuned models when compared to the manually tuned BraggNN and PtychoNN benchmark. We study and demonstrate the importance of the exploring the search space of tunable hyperparameters in enhancing the performance of bragg peak detection and ptychographic reconstruction. Our NAS and HPS of (1) BraggNN achieves a 31.03\% improvement in bragg peak detection accuracy with a 87.57\% reduction in model size, and (2) PtychoNN achieves a 16.77\% improvement in model accuracy and a 12.82\% reduction in model size when compared to the baseline PtychoNN model. When inferred on the Orin-AGX platform, the optimized Braggnn and Ptychonn models demonstrate a 10.51\% and 9.47\% reduction in inference latency and a 44.18\% and 15.34\% reduction in energy consumption when compared to their respective baselines, when inferred in the Orin-AGX edge platform.

artificial intelligence, machine learning, model size, (15 more...)

arXiv.org Artificial Intelligence

2404.10689

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (1.00)

Industry: Energy (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

AI Competitions and Benchmarks: Dataset Development

Egele, Romain, Junior, Julio C. S. Jacques, van Rijn, Jan N., Guyon, Isabelle, Baró, Xavier, Clapés, Albert, Balaprakash, Prasanna, Escalera, Sergio, Moeslund, Thomas, Wan, Jun

arXiv.org Machine LearningApr-15-2024

Machine learning is now used in many applications thanks to its ability to predict, generate, or discover patterns from large quantities of data. However, the process of collecting and transforming data for practical use is intricate. Even in today's digital era, where substantial data is generated daily, it is uncommon for it to be readily usable; most often, it necessitates meticulous manual data preparation. The haste in developing new models can frequently result in various shortcomings, potentially posing risks when deployed in real-world scenarios (e.g., social discrimination, critical failures), leading to the failure or substantial escalation of costs in AI-based projects. This chapter provides a comprehensive overview of established methodological tools, enriched by our practical experience, in the development of datasets for machine learning. Initially, we develop the tasks involved in dataset development and offer insights into their effective management (including requirements, design, implementation, evaluation, distribution, and maintenance). Then, we provide more details about the implementation process which includes data collection, transformation, and quality evaluation. Finally, we address practical considerations regarding dataset distribution and maintenance.

data mining, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2404.09703

Country:

Europe (1.00)
North America > United States > Wisconsin (0.14)
North America > United States > New York (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
(4 more...)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(6 more...)

Add feedback

Streamlining Ocean Dynamics Modeling with Fourier Neural Operators: A Multiobjective Hyperparameter and Architecture Optimization Approach

Sun, Yixuan, Sowunmi, Ololade, Egele, Romain, Narayanan, Sri Hari Krishna, Van Roekel, Luke, Balaprakash, Prasanna

arXiv.org Machine LearningApr-10-2024

Training an effective deep learning model to learn ocean processes involves careful choices of various hyperparameters. We leverage the advanced search algorithms for multiobjective optimization in DeepHyper, a scalable hyperparameter optimization software, to streamline the development of neural networks tailored for ocean modeling. The focus is on optimizing Fourier neural operators (FNOs), a data-driven model capable of simulating complex ocean behaviors. Selecting the correct model and tuning the hyperparameters are challenging tasks, requiring much effort to ensure model accuracy. DeepHyper allows efficient exploration of hyperparameters associated with data preprocessing, FNO architecture-related hyperparameters, and various model training strategies. We aim to obtain an optimal set of hyperparameters leading to the most performant model. Moreover, on top of the commonly used mean squared error for model training, we propose adopting the negative anomaly correlation coefficient as the additional loss term to improve model performance and investigate the potential trade-off between the two terms. The experimental results show that the optimal set of hyperparameters enhanced model performance in single timestepping forecasting and greatly exceeded the baseline configuration in the autoregressive rollout for long-horizon forecasting up to 30 days. Utilizing DeepHyper, we demonstrate an approach to enhance the use of FNOs in ocean dynamics forecasting, offering a scalable solution with improved precision.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Machine Learning

2404.05768

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.34)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback