 Shankar, Mallikarjun


Scalable Artificial Intelligence for Science: Perspectives, Methods and Exemplars

arXiv.org Artificial Intelligence

In a post-ChatGPT world, this paper explores the potential of leveraging scalable artificial intelligence for scientific discovery. We propose that scaling up artificial intelligence on high-performance computing platforms is essential to address complex scientific problems. This perspective focuses on scientific use cases like cognitive simulations, large language models for scientific inquiry, medical image analysis, and physics-informed approaches. The study outlines the methodologies needed to address such challenges at scale on supercomputers or the cloud and provides exemplars of such approaches applied to solve a variety of scientific problems. In light of ChatGPT's growing popularity, the transformative potential of AI in science becomes increasingly evident. Although a number of recent articles highlight the transformative power of AI in science [1, 2, 3], few provide specifics on how to implement such methods at scale on supercomputers. Using ChatGPT as an archetype, we argue that the success of such complex AI models results from two primary advancements: (1) the development of the transformer architecture, and (2) the ability to train on vast amounts of internet-scale data. This process represents a broader trend within the field of AI where combining massive amounts of training data with large-scale computational resources becomes the foundation of scientific breakthroughs. Several examples underscore the integral role of using large-scale computational resources and colossal amounts of data to achieve scientific breakthroughs. For instance, Khan et al. [4] used AI and large-scale computing for advanced models of black hole mergers, leveraging a dataset of 14 million waveforms on the Summit supercomputer. Riley et al. [5] made significant progress toward understanding the physics of stratified fluid turbulence by simulating flows at a Prandtl number of seven, which represents ocean water at 20 °C. Such simulations used four trillion grid points and required petabytes of storage [6].


Zero Coordinate Shift: Whetted Automatic Differentiation for Physics-informed Operator Learning

arXiv.org Artificial Intelligence

Automatic differentiation (AD) is a critical step in physics-informed machine learning, required for computing the high-order derivatives of network output w.r.t. coordinates of collocation points. In this paper, we present a novel and lightweight algorithm to conduct AD for physics-informed operator learning, which we call the trick of Zero Coordinate Shift (ZCS). Instead of making all sampled coordinates leaf variables, ZCS introduces only one scalar-valued leaf variable for each spatial or temporal dimension, simplifying the wanted derivatives from "many-roots-many-leaves" to "one-root-many-leaves", whereby reverse-mode AD becomes directly utilisable. It has led to an outstanding performance leap by avoiding the duplication of the computational graph along the dimension of functions (physical parameters). ZCS is easy to implement with current deep learning libraries; our own implementation is achieved by extending the DeepXDE package. We carry out a comprehensive benchmark analysis and several case studies, training physics-informed DeepONets to solve partial differential equations (PDEs) without data. The results show that ZCS has persistently reduced GPU memory consumption and wall time for training by an order of magnitude, and this reduction factor scales with the number of functions. As a low-level optimisation technique, ZCS imposes no restrictions on data, physics (PDE) or network architecture and does not compromise training results from any aspect.
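The "one-root-many-leaves" reduction described in the abstract can be sketched in a few lines of PyTorch. This is a toy illustration of the ZCS idea, not the authors' DeepXDE implementation: the network is replaced by a closed-form function u(x) = sin(3x) so the pointwise derivative can be checked analytically, and the dummy-vector double-backward step is one standard way to recover pointwise derivatives from the single scalar leaf.

```python
import torch

# Collocation points: plain tensors, NOT leaf variables requiring grad.
x = torch.linspace(0.0, 1.0, 1000)

# ZCS: one scalar-valued leaf per spatial dimension, initialised to zero,
# so evaluating at x + z leaves the function values unchanged.
z = torch.zeros((), requires_grad=True)

# Stand-in for the network output u at the shifted coordinates.
u = torch.sin(3.0 * (x + z))

# Dummy leaf vector: contracting u with it gives a single scalar root,
# turning "many roots" into "one root" for reverse-mode AD.
a = torch.ones_like(u, requires_grad=True)
omega = (a * u).sum()

# d(omega)/dz = sum_i a_i * du_i/dx; keep the graph for a second backward.
(domega_dz,) = torch.autograd.grad(omega, z, create_graph=True)

# Differentiating that scalar w.r.t. the dummy vector recovers the
# pointwise derivatives du/dx at every collocation point in one pass.
(du_dx,) = torch.autograd.grad(domega_dz, a)

# du_dx[i] should equal 3*cos(3*x[i]).
```

Because only the scalar z (and the dummy vector) are leaves, the computational graph is built once over all collocation points rather than duplicated per point, which is the source of the memory and wall-time savings the abstract reports.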


DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

arXiv.org Artificial Intelligence

In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present the DeepSpeed4Science initiative (deepspeed4science.ai), which aims to build unique capabilities through AI system technology innovations to help domain experts unlock today's biggest science mysteries. By leveraging DeepSpeed's current technology pillars (training, inference and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity beyond the common technical approaches used for accelerating generic large language models (LLMs). In this paper, we showcase the early progress we made with DeepSpeed4Science in addressing two of the critical system challenges in structural biology research.