Pacific Northwest National Laboratory
Stabilizing PDE-ML coupled systems
Qadeer, Saad, Stinis, Panos, Wan, Hui
Partial differential equations (PDEs) are an essential modeling tool in engineering and physical sciences. The numerical methods used for solving the more descriptive and sophisticated of these models comprise many computationally expensive modules. Machine learning (ML) provides a way of replacing some of these modules by surrogates that are much more efficient at the time of inference. The resulting PDE-ML coupled systems, however, can be highly susceptible to instabilities [1-3]. Efforts towards ameliorating these have mostly concentrated on improving the accuracy of the surrogates, imbuing them with additional structure, or introducing problem-specific stabilizers, and have garnered limited success [4-7]. In this article, we study a prototype problem to understand the mathematical subtleties involved in PDE-ML coupling, and draw insights that can help with more complex systems.
Emulating the Global Change Analysis Model with Deep Learning
Holmes, Andrew, Jensen, Matt, Coffland, Sarah, Shen, Hidemi Mitani, Sizemore, Logan, Bassetti, Seth, Nieva, Brenna, Tebaldi, Claudia, Snyder, Abigail, Hutchinson, Brian
The Global Change Analysis Model (GCAM) simulates complex interactions between the coupled Earth and human systems, providing valuable insights into the co-evolution of land, water, and energy sectors under different future scenarios. Understanding the sensitivities and drivers of this multisectoral system can lead to more robust understanding of the different pathways to particular outcomes. The interactions and complexity of the coupled human-Earth systems make GCAM simulations costly to run at scale - a requirement for large ensemble experiments which explore uncertainty in model parameters and outputs. A differentiable emulator with similar predictive power, but greater efficiency, could provide novel scenario discovery and analysis of GCAM and its outputs, requiring fewer runs of GCAM. As a first use case, we train a neural network on an existing large ensemble that explores a range of GCAM inputs related to different relative contributions of energy production sources, with a focus on wind and solar. We complement this existing ensemble with interpolated input values and a wider selection of outputs, predicting 22,528 GCAM outputs across time, sectors, and regions. We report a median $R^2$ score of 0.998 for the emulator's predictions and an $R^2$ score of 0.812 for its input-output sensitivity.
Scaffold-Based Multi-Objective Drug Candidate Optimization
Kruel, Agustin, McNaughton, Andrew D., Kumar, Neeraj
In therapeutic design, balancing various physicochemical properties is crucial for molecule development, similar to how Multiparameter Optimization (MPO) evaluates multiple variables to meet a primary goal. While many molecular features can now be predicted using in silico methods, aiding early drug development, the vast data generated from high-throughput virtual screening challenges the practicality of traditional MPO approaches. Addressing this, we introduce a scaffold-focused graph-based Markov chain Monte Carlo framework (ScaMARS) built to generate molecules with optimal properties. This innovative framework is capable of self-training and handling a wider array of properties, sampling different chemical spaces according to the starting scaffold. The benchmark analysis on several properties shows that ScaMARS has a diversity score of 84.6% and has a much higher success rate of 99.5% compared to conditional models. The integration of new features into MPO significantly enhances its adaptability and effectiveness in therapeutic design, facilitating the discovery of candidates that efficiently optimize multiple properties.
Cybersecurity Defenders Are Expanding Their AI Toolbox - Technology Org
Cybersecurity scientists have taken a key step toward harnessing a form of artificial intelligence known as deep reinforcement learning, or DRL, to protect computer networks. When faced with sophisticated cyberattacks in a rigorous simulation setting, deep reinforcement learning was effective at stopping adversaries from reaching their goals up to 95 percent of the time. The outcome offers promise for a role for autonomous AI in proactive cyber defense. Scientists from the Department of Energy's Pacific Northwest National Laboratory documented their findings in a research paper and presented their work Feb. 14 at a workshop on AI for Cybersecurity during the annual meeting of the Association for the Advancement of Artificial Intelligence in Washington, D.C. The starting point was the development of a simulation environment to test multistage attack scenarios involving distinct types of adversaries.
How Big Data Carried Graph Theory Into New Dimensions
The mathematical language for talking about connections, which usually depends on networks--vertices (dots) and edges (lines connecting them)--has been an invaluable way to model real-world phenomena since at least the 18th century. But a few decades ago, the emergence of giant data sets forced researchers to expand their toolboxes and, at the same time, gave them sprawling sandboxes in which to apply new mathematical insights. Since then, said Josh Grochow, a computer scientist at the University of Colorado, Boulder, there's been an exciting period of rapid growth as researchers have developed new kinds of network models that can find complex structures and signals in the noise of big data. Grochow is among a growing chorus of researchers who point out that when it comes to finding connections in big data, graph theory has its limits. Original story reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.
New machine learning tool tracks urban traffic congestion
A new machine learning algorithm is poised to help urban transportation analysts relieve bottlenecks and chokepoints that routinely snarl city traffic. The tool, called TranSEC, was developed at the U.S. Department of Energy's Pacific Northwest National Laboratory to help urban traffic engineers get access to actionable information about traffic patterns in their cities. WATCH: https://www.youtube.com/watch?v=8S4bLv9CtOo (Video by Graham Bourque, Pacific Northwest National Laboratory) Currently, publicly available traffic information at the street level is sparse and incomplete. Traffic engineers generally have relied on isolated traffic counts, collision statistics and speed data to determine roadway conditions. The new tool uses traffic datasets collected from Uber drivers and other publicly available traffic sensor data to map street-level traffic flow over time.
Senior Data Engineer 3 - Machine Learning and Cyber in RICHLAND, Washington, United States
Do you want to create a legacy of meaningful research for the greater good? Do you want to lead and contribute to work in support of an organization that addresses some of today's most challenging problems that face our Nation? Then join us in the Data Sciences and Analytics Group at the Pacific Northwest National Laboratory (PNNL)! For more than 50 years, PNNL has advanced the frontiers of science and engineering in the service of our nation and the world in the areas of energy, the environment and national security. PNNL is committed to advancing the state-of-the-art in artificial intelligence through applied machine learning and deep learning to support scientific discovery and our sponsors' missions.
Scaling the training of particle classification on simulated MicroBooNE events to multiple GPUs
Hagen, Alex, Church, Eric, Strube, Jan, Bhattacharya, Kolahal, Amatya, Vinay
Measurements in Liquid Argon Time Projection Chamber (LArTPC) neutrino detectors, such as the MicroBooNE detector at Fermilab, feature large, high fidelity event images. Deep learning techniques have been extremely successful in classification tasks of photographs, but their application to LArTPC event images is challenging, due to the large size of the events. Events in these detectors are typically two orders of magnitude larger than images found in classical challenges, like recognition of handwritten digits contained in the MNIST database or object recognition in the ImageNet database. Ideally, training would occur on many instances of the entire event data, instead of many instances of cropped regions of interest from the event data. However, such efforts lead to extremely long training cycles, which slow down the exploration of new network architectures and hyperparameter scans to improve the classification performance. We present studies of scaling a LArTPC classification problem on multiple architectures, spanning multiple nodes. The studies are carried out on simulated events in the MicroBooNE detector. We emphasize that it is beyond the scope of this study to optimize networks or extract the physics from any results here. Institutional computing at Pacific Northwest National Laboratory and the SummitDev machine at Oak Ridge National Laboratory's Leadership Computing Facility have been used. To our knowledge, this is the first use of state-of-the-art Convolutional Neural Networks for particle physics and their attendant compute techniques onto the DOE Leadership Class Facilities. We expect benefits to accrue particularly to the Deep Underground Neutrino Experiment (DUNE) LArTPC program, the flagship US High Energy Physics (HEP) program for the coming decades.
Street-level Travel-time Estimation via Aggregated Uber Data
Maass, Kelsey, Sathanur, Arun V, Khan, Arif, Rallo, Robert
Estimating temporal patterns in travel times along road segments in urban settings is of central importance to traffic engineers and city planners. In this work, we propose a methodology to leverage coarse-grained and aggregated travel time data to estimate the street-level travel times of a given metropolitan area. Our main focus is to estimate travel times along the arterial road segments where relevant data are often unavailable. The central idea of our approach is to leverage easy-to-obtain, aggregated data sets with broad spatial coverage, such as the data published by Uber Movement, as the fabric over which other expensive, fine-grained datasets, such as loop counter and probe data, can be overlaid. Our proposed methodology uses a graph representation of the road network and combines several techniques such as graph-based routing, trip sampling, graph sparsification, and least-squares optimization to estimate the street-level travel times. Using sampled trips and weighted shortest-path routing, we iteratively solve constrained least-squares problems to obtain the travel time estimates. We demonstrate our method on the Los Angeles metropolitan-area street network, where aggregated travel time data is available for trips between traffic analysis zones. Additionally, we present techniques to scale our approach via a novel graph pseudo-sparsification technique.
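The loop described in the abstract above (route trips with the current estimates, then refit the segment times by constrained least squares) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the four-node network, the lower bound of 0.1, and the use of plain projected gradient descent as the constrained solver are all assumptions made for illustration, and trip sampling and graph sparsification are left out entirely.

```python
import heapq

def shortest_path(adj, weights, src, dst):
    """Dijkstra: return the edge ids on a shortest src -> dst path.
    Assumes dst is reachable from src and all weights are positive."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == dst:
            break
        for v, e in adj[u]:
            nd = d + weights[e]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, (u, e)
                heapq.heappush(heap, (nd, v))
    path, node = [], dst
    while node != src:
        node, e = prev[node]
        path.append(e)
    return path[::-1]

def fit_edge_times(paths, observed, x0, lower=0.1, steps=2000):
    """Projected gradient descent on 0.5 * ||A x - b||^2 with x >= lower,
    where row i of A marks the edges used by trip i (hypothetical solver)."""
    x = list(x0)
    col_count = [0] * len(x)
    for p in paths:
        for e in p:
            col_count[e] += 1
    # Safe step size: 1 / (max row sum * max column sum) of A.
    step = 1.0 / (max(len(p) for p in paths) * max(max(col_count), 1))
    for _ in range(steps):
        grad = [0.0] * len(x)
        for p, b in zip(paths, observed):
            residual = sum(x[e] for e in p) - b
            for e in p:
                grad[e] += residual
        # Gradient step followed by projection onto the bound x >= lower.
        x = [max(lower, xi - step * g) for xi, g in zip(x, grad)]
    return x

def estimate_travel_times(edges, trips, x0, rounds=5):
    """Alternate between routing trips and refitting edge travel times."""
    adj = {}
    for e, (u, v) in enumerate(edges):
        adj.setdefault(u, []).append((v, e))
        adj.setdefault(v, []).append((u, e))
    x = list(x0)
    for _ in range(rounds):
        paths = [shortest_path(adj, x, s, t) for s, t, _ in trips]
        x = fit_edge_times(paths, [b for _, _, b in trips], x)
    return x

# Toy network (assumed data): undirected edges as (node, node) pairs.
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
# Aggregated observations as (origin, destination, travel time).
trips = [(0, 1, 2.0), (1, 2, 3.0), (2, 3, 4.0), (0, 2, 5.0), (0, 3, 9.0)]
estimates = estimate_travel_times(edges, trips, x0=[1.0] * len(edges))
print([round(t, 3) for t in estimates])
```

In this toy setup, the three segments that are each observed directly by a trip are pinned down by the data, while the time on the shortcut edge (0, 2) is only determined by the trips that happen to be routed over it. That kind of ambiguity is one reason a real pipeline would combine broad aggregated data with finer-grained sources.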
How does AI improve grid performance? No one fully understands and that's limiting its use
Just as power system operators are mastering data analytics to optimize hardware efficiencies, they are discovering how the complexities of artificial intelligence tools can do far more, and how to choose which to use. With deployment of advanced metering infrastructure (AMI) and smart sensor-equipped hardware, system operators are capturing unprecedented levels of data. Cloud computing and massive computational capabilities are allowing data analytics to make these investments pay off for customers. But it may take machine learning (ML) and artificial intelligence (AI) to address new power grid complexities. AI is a form of computer science that would make power system management fully autonomous in real time, researchers and private sector providers of power system services told Utility Dive.