Collaborating Authors

 Rogers, David


The Artificial Scientist -- in-transit Machine Learning of Plasma Simulations

arXiv.org Artificial Intelligence

Increasing HPC cluster sizes and large-scale simulations that produce petabytes of data per run create massive I/O and storage challenges for analysis. Deep learning-based techniques, in particular, make use of these large amounts of domain data to extract patterns that help build scientific understanding. Here, we demonstrate a streaming workflow in which simulation data is streamed directly to a machine-learning (ML) framework, circumventing the file system bottleneck. Data is transformed in transit, asynchronously to the simulation and the training of the model. With the presented workflow, data operations can be performed in common and easy-to-use programming languages, freeing the application user from adapting the application output routines. As a proof of concept, we consider a GPU-accelerated particle-in-cell (PIConGPU) simulation of the Kelvin-Helmholtz instability (KHI). We employ experience replay to avoid catastrophic forgetting in learning from this non-steady process in a continual manner. We detail challenges addressed while porting and scaling to the Frontier exascale system.
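The abstract names experience replay as the guard against catastrophic forgetting but does not say how the buffer is managed. The following is a minimal sketch of the general idea only, not the paper's implementation: it assumes a fixed-capacity buffer filled by reservoir sampling, and the names `ReplayBuffer`, `make_batch`, and `n_replay` are hypothetical.

```python
import random


class ReplayBuffer:
    """Fixed-size buffer filled by reservoir sampling over a data stream.

    Assumption: reservoir sampling is one common policy; the paper's
    actual buffer policy is not stated in the abstract.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of samples offered so far
        self.rng = random.Random(seed)

    def add(self, sample):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(sample)
        else:
            # Keep the new sample with probability capacity / seen,
            # overwriting a uniformly chosen stored sample.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = sample

    def sample(self, k):
        # Draw up to k stored samples without replacement.
        return self.rng.sample(self.items, min(k, len(self.items)))


def make_batch(new_samples, buffer, n_replay):
    """Mix fresh stream samples with replayed older ones, then store the fresh ones."""
    batch = list(new_samples) + buffer.sample(n_replay)
    for s in new_samples:
        buffer.add(s)
    return batch
```

Each training step would call `make_batch` with the samples that just arrived from the simulation, so the model keeps seeing earlier stages of the non-steady process alongside the newest data.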


Scalable Training of Graph Foundation Models for Atomistic Materials Modeling: A Case Study with HydraGNN

arXiv.org Artificial Intelligence

We present our work on developing and training scalable graph foundation models (GFMs) using HydraGNN, a multi-headed graph convolutional neural network architecture. HydraGNN expands the boundaries of graph neural networks (GNNs) in both training scale and data diversity. It abstracts over message passing algorithms, allowing both reproduction of and comparison across algorithmic innovations that define convolution in GNNs. This work discusses a series of optimizations that have allowed scaling up the GFM training to tens of thousands of GPUs on datasets that consist of hundreds of millions of graphs. Our GFMs use multi-task learning (MTL) to simultaneously learn graph-level and node-level properties of atomistic structures, such as the total energy and atomic forces. Using over 150 million atomistic structures for training, we illustrate the performance of our approach along with the lessons learned on two United States Department of Energy (US-DOE) supercomputers, namely the Perlmutter petascale system at the National Energy Research Scientific Computing Center and the Frontier exascale system at Oak Ridge National Laboratory. The HydraGNN architecture enables the GFM to achieve near-linear strong scaling performance using more than 2,000 GPUs on Perlmutter and 16,000 GPUs on Frontier. Hyperparameter optimization (HPO) was performed on over 64,000 GPUs on Frontier to select GFM architectures with high accuracy. Early stopping was applied on each GFM architecture for energy awareness in performing such an extreme-scale task. The training of an ensemble of highest-ranked GFM architectures continued until convergence to establish uncertainty quantification (UQ) capabilities with ensemble learning. Our contribution opens the door for rapidly developing, training, and deploying GFMs using large-scale computational resources to enable AI-accelerated materials discovery and design.
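The multi-task objective mentioned above, one graph-level target (total energy) and one node-level target (atomic forces), can be illustrated with a weighted sum of per-task losses. This is only a sketch under stated assumptions: the mean-squared-error form, the weights `w_energy` and `w_forces`, and the function names are hypothetical and are not taken from HydraGNN.

```python
def mse(pred, target):
    """Mean squared error over two equal-length flat sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def multitask_loss(energy_pred, energy_true, forces_pred, forces_true,
                   w_energy=1.0, w_forces=1.0):
    """Weighted sum of a graph-level (energy) and a node-level (forces) loss.

    Assumption: plain MSE per task and fixed scalar weights.
    """
    # Graph-level task: one scalar energy per structure.
    e_loss = mse([energy_pred], [energy_true])
    # Node-level task: one 3-vector per atom, flattened to components.
    f_pred = [c for atom in forces_pred for c in atom]
    f_true = [c for atom in forces_true for c in atom]
    f_loss = mse(f_pred, f_true)
    return w_energy * e_loss + w_forces * f_loss
```

The two weights control how strongly each head drives the shared layers; in practice they are usually treated as hyperparameters.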


Data Analysis using G/SPLINES

Neural Information Processing Systems

G/SPLINES is an algorithm for building functional models of data. It uses genetic search to discover combinations of basis functions which are then used to build a least-squares regression model. Because it produces a population of models which evolve over time rather than a single model, it allows analysis not possible with other regression-based approaches.

1 INTRODUCTION

G/SPLINES is a hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm (Friedman, 1990) with Holland's Genetic Algorithm (Holland, 1975). G/SPLINES has advantages over MARS in that it requires fewer least-squares computations, is easily extendable to non-spline basis functions, may discover models inaccessible to local-variable selection algorithms, and allows significantly larger problems to be considered. These issues are discussed in (Rogers, 1991). This paper begins with a discussion of linear regression models, followed by a description of the G/SPLINES algorithm, and finishes with a series of experiments illustrating its performance, robustness, and analysis capabilities.
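The combination described here, a genetic search over sets of basis functions with each candidate set fitted by least squares, can be sketched in a few dozen lines. This is a loose illustration rather than the G/SPLINES algorithm itself. It makes several assumptions: hinge (truncated linear) basis functions, a simple per-term complexity penalty instead of the paper's fitness measure, steady-state replacement of the worst model, and a tiny ridge term for numerical safety. All function names are hypothetical.

```python
import random


def hinge(var, knot, sign):
    """Truncated linear spline basis: max(0, sign * (x[var] - knot))."""
    return lambda x: max(0.0, sign * (x[var] - knot))


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][k] * w[k] for k in range(r + 1, n))) / M[r][r]
    return w


def fit(genes, X, y, ridge=1e-6):
    """Least-squares fit of intercept + basis functions; returns (weights, rss)."""
    basis = [lambda x: 1.0] + [hinge(*g) for g in genes]
    Phi = [[f(x) for f in basis] for x in X]
    m = len(basis)
    # Normal equations; the small ridge keeps the system non-singular.
    A = [[sum(r[i] * r[j] for r in Phi) + (ridge if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    b = [sum(r[i] * t for r, t in zip(Phi, y)) for i in range(m)]
    w = solve(A, b)
    rss = sum((sum(wi * pi for wi, pi in zip(w, r)) - t) ** 2
              for r, t in zip(Phi, y))
    return w, rss


def score(genes, X, y, penalty=0.05):
    """Lower is better: mean squared residual plus a per-term penalty (assumed form)."""
    return fit(genes, X, y)[1] / len(y) + penalty * len(genes)


def random_gene(X, rng):
    var = rng.randrange(len(X[0]))
    knot = rng.choice(X)[var]  # knots are placed at observed data values
    return (var, knot, rng.choice((-1.0, 1.0)))


def evolve(X, y, pop_size=20, steps=100, max_terms=6, seed=0):
    """Return the final population as (genes, score) pairs, best first."""
    rng = random.Random(seed)
    pop = [[random_gene(X, rng) for _ in range(2)] for _ in range(pop_size)]
    scores = [score(g, X, y) for g in pop]
    for _ in range(steps):
        a, b = rng.sample(pop, 2)
        # Crossover: a prefix of one parent joined to a suffix of the other.
        child = a[:rng.randrange(len(a) + 1)] + b[rng.randrange(len(b) + 1):]
        # Mutation: sometimes add a new random basis function.
        if rng.random() < 0.3 or not child:
            child.append(random_gene(X, rng))
        child = list(dict.fromkeys(child))[:max_terms]  # drop duplicate terms
        s = score(child, X, y)
        worst = max(range(pop_size), key=scores.__getitem__)
        if s < scores[worst]:
            pop[worst], scores[worst] = child, s
    order = sorted(range(pop_size), key=scores.__getitem__)
    return [(pop[i], scores[i]) for i in order]
```

Because `evolve` returns the whole ranked population rather than one winner, the surviving models can be inspected together, for example to see which variables and knots recur across them.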


Predicting Weather Using a Genetic Memory: A Combination of Kanerva's Sparse Distributed Memory with Holland's Genetic Algorithms

Neural Information Processing Systems

Kanerva's sparse distributed memory (SDM) is an associative-memory model based on the mathematical properties of high-dimensional binary address spaces. Holland's genetic algorithms are a search technique for high-dimensional spaces inspired by evolutionary processes of DNA. "Genetic Memory" is a hybrid of the above two systems, in which the memory uses a genetic algorithm to dynamically reconfigure its physical storage locations to reflect correlations between the stored addresses and data. For example, when presented with raw weather station data, the Genetic Memory discovers specific features in the weather data which correlate well with upcoming rain, and reconfigures the memory to utilize this information effectively. This architecture is designed to maximize the ability of the system to scale up to handle real-world problems.
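For readers unfamiliar with the underlying memory, the following is a minimal sketch of a standard sparse distributed memory, without the genetic reconfiguration step described above. It assumes randomly placed storage locations, activation by Hamming radius, and up/down counters per data bit; the class name `SDM` and its parameters are hypothetical, and the data word is assumed to have the same width as the address.

```python
import random


class SDM:
    """Basic sparse distributed memory over binary addresses (illustrative sketch)."""

    def __init__(self, n_bits, n_locations, radius, seed=0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Fixed random addresses of the physical storage locations.
        self.addresses = [[rng.randint(0, 1) for _ in range(n_bits)]
                          for _ in range(n_locations)]
        # One integer counter per data bit at each location.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address):
        """Indices of locations within the Hamming radius of the given address."""
        return [i for i, a in enumerate(self.addresses)
                if sum(x != y for x, y in zip(a, address)) <= self.radius]

    def write(self, address, data):
        # Each active location counts up for a 1 bit and down for a 0 bit.
        for i in self._active(address):
            row = self.counters[i]
            for k, bit in enumerate(data):
                row[k] += 1 if bit else -1

    def read(self, address):
        # Sum counters over active locations, then threshold at zero.
        sums = [0] * self.n_bits
        for i in self._active(address):
            for k, c in enumerate(self.counters[i]):
                sums[k] += c
        return [1 if s > 0 else 0 for s in sums]
```

A word written at one address can be read back from a nearby, slightly corrupted address because the two addresses activate overlapping sets of locations. A genetic variant would change where `self.addresses` are placed instead of leaving them random.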


Statistical Prediction with Kanerva's Sparse Distributed Memory

Neural Information Processing Systems

David Rogers, Research Institute for Advanced Computer Science, MS 230-5, NASA Ames Research Center, Moffett Field, CA 94035

ABSTRACT

A new viewpoint of the processing performed by Kanerva's sparse distributed memory (SDM) is presented. In conditions of near- or over-capacity, where the associative-memory behavior of the model breaks down, the processing performed by the model can be interpreted as that of a statistical predictor. Mathematical results are presented which serve as the framework for a new statistical viewpoint of sparse distributed memory and for which the standard formulation of SDM is a special case. This viewpoint suggests possible enhancements to the SDM model, including a procedure for improving the predictiveness of the system based on Holland's work with 'Genetic Algorithms', and a method for improving the capacity of SDM even when used as an associative memory.

OVERVIEW

This work is the result of studies involving two seemingly separate topics that proved to share a common framework. The first topic, statistical prediction, is the task of associating extremely large perceptual state vectors with future events.